File Coverage

File:blib/lib/Database/Abstraction.pm
Coverage:64.5%

linestmtbrancondsubtimecode
1package Database::Abstraction;
2
3# Author Nigel Horne: njh@nigelhorne.com
4# Copyright (C) 2015-2025, Nigel Horne
5
6# Usage is subject to licence terms.
7# The licence terms of this software are as follows:
8# Personal single user, single computer use: GPL2
9# All other users (for example, Commercial, Charity, Educational, Government)
10#       must apply in writing for a licence for use from Nigel Horne at the
11#       above e-mail.
12
13# TODO: Switch "entry" to off by default, and enable by passing 'entry'
14#       though that wouldn't be so nice for AUTOLOAD
15# TODO: support a directory hierarchy of databases
16# TODO: consider returning an object or array of objects, rather than hashes
17# TODO: Add redis database - could be of use for Geo::Coder::Free
18#       use select() to select a database - use the table arg
19#       new(database => 'redis://servername');
20# TODO: Add a "key" property, defaulting to "entry", which would be the name of the key
21# TODO: The maximum number to return should be tuneable (as a LIMIT)
22# TODO: Add full CRUD support
23# TODO: It would be better for the default sep_char to be ',' rather than '!'
24# FIXME:        t/xml.t fails in slurping mode
25# TODO: Other databases e.g., Redis, noSQL, remote databases such as MySQL, PostgreSQL
26# TODO: The no_entry/entry terminology is confusing.  Replace with no_id/id_column
27# TODO: Add support for DBM::Deep
28# TODO: Log queries and the time that they took to execute per database
29
30
16
16
16
2001356
15
310
use warnings;
31
16
16
16
27
10
135
use strict;
32
33
16
16
16
2356
6824
26
use boolean;
34
16
16
16
476
14
330
use Carp;
35
16
16
16
22
16
204
use Data::Dumper;
36
16
16
16
3084
41618
432
use Data::Reuse;
37
16
16
16
3311
205138
1752
use DBD::SQLite::Constants qw/:file_open/;      # For SQLITE_OPEN_READONLY
38
16
16
16
82
16
1572
use Fcntl;      # For O_RDONLY
39
16
16
16
36
10
207
use File::Spec;
40
16
16
16
2710
3532
335
use File::pfopen 0.03;  # For $mode and list context
41
16
16
16
5333
98576
555
use File::Temp;
42
16
16
16
3038
430273
274
use Log::Abstraction 0.26;
43
16
16
16
3038
61909
250
use Object::Configure 0.16;
44
16
16
16
34
71
229
use Params::Get 0.13;
45# use Error::Simple;    # A nice idea to use this, but it doesn't play well with "use lib"
46
16
16
16
36
17
281
use Scalar::Util;
47
48our %defaults;
49
16
16
16
28
29
52744
use constant    DEFAULT_MAX_SLURP_SIZE => 16 * 1024; # CSV files <= than this size are read into memory
50
51 - 59
=head1 NAME

Database::Abstraction - Read-only Database Abstraction Layer (ORM)

=head1 VERSION

Version 0.33

=cut
60
61our $VERSION = '0.33';
62
63 - 193
=head1 DESCRIPTION

C<Database::Abstraction> is a read-only database abstraction layer (ORM) for Perl,
designed to provide a simple interface for accessing and querying various types of databases such as CSV, XML, and SQLite without the need to write SQL queries.
It promotes code maintainability by abstracting database access logic into a single interface,
allowing users to switch between different storage formats seamlessly.
The module supports caching for performance optimization,
flexible logging for debugging and monitoring,
and includes features like the AUTOLOAD method for convenient access to database columns.
By handling numerous database and file formats,
C<Database::Abstraction> adds versatility and simplifies the management of read-intensive applications.

=head1 SYNOPSIS

Abstract class giving read-only access to CSV,
XML,
BerkeleyDB and SQLite databases via Perl without writing any SQL,
using caching for performance optimization.

The module promotes code maintainability by abstracting database access logic into a single interface.
Users can switch between different storage formats without changing application logic.
The ability to handle numerous database and file formats adds versatility and makes it useful for a variety of applications.

It's a simple ORM like interface which,
for all of its simplicity,
allows you to do a lot of the heavy lifting of simple database operations without any SQL.
It offers functionalities like opening the database and fetching data based on various criteria.

Built-in support for flexible and configurable caching improves performance for read-intensive applications.

Supports logging to debug and monitor database operations.

Look for databases in $directory in this order:

=over 4

=item 1 C<SQLite>

File ends with .sql

=item 2 C<PSV>

Pipe separated file, file ends with .psv

=item 3 C<CSV>

File ends with .csv or .db, can be gzipped. Note the default sep_char is '!' not ','

=item 4 C<XML>

File ends with .xml

=item 5 C<BerkeleyDB>

File ends with .db

=back

The AUTOLOAD feature allows for convenient access to database columns using method calls.
It hides the complexity of querying the underlying data storage.

If the table has a key column,
entries are keyed on that and sorts are based on it.
To turn that off, pass 'no_entry' to the constructor, for legacy
reasons it's enabled by default.
The key column's default name is 'entry', but it can be overridden by the 'id' parameter.

Arrays are made read-only before being returned.
To disable that, pass C<no_fixate> to the constructor.

CSV files that are not no_entry can have empty lines or comment lines starting with '#',
to make them more readable.

=head1 EXAMPLE

If the file /var/dat/foo.csv contains something like:

    "customer_id","name"
    "plugh","John"
    "xyzzy","Jane"

Create a driver for the file in .../Database/foo.pm:

    package Database::foo;

    use Database::Abstraction;

    our @ISA = ('Database::Abstraction');

    # Regular CSV: There is no entry column and the separators are commas
    sub new
    {
        my $class = shift;
        my %args = (ref($_[0]) eq 'HASH') ? %{$_[0]} : @_;

        return $class->SUPER::new(no_entry => 1, sep_char => ',', %args);
    }

You can then use this code to access the data via the driver:

    # Opens the file, e.g. /var/dat/foo.csv
    my $foo = Database::foo->new(directory => '/var/dat');

    # Prints "John"
    print 'Customer name ', $foo->name(customer_id => 'plugh'), "\n";

    # Prints:
    #  $VAR1 = {
    #     'customer_id' => 'xyzzy',
    #     'name' => 'Jane'
    #  };
    my $row = $foo->fetchrow_hashref(customer_id => 'xyzzy');
    print Data::Dumper->new([$row])->Dump();

=head1 SUBROUTINES/METHODS

=head2 init

Initializes the abstraction class and its subclasses with optional arguments for configuration.

    Database::Abstraction::init(directory => '../data');

See the documentation for new to see what variables can be set.

Returns a reference to a hash of the current values.
Therefore when given with no arguments you can get the current default values:

    my $defaults = Database::Abstraction::init();
    print $defaults->{'directory'}, "\n";

=cut
194
195# Subroutine to initialize with args
196sub init
197{
198
46
465825
        if(my $params = Params::Get::get_params(undef, @_)) {
199
44
454
                if(($params->{'expires_in'} && !$params->{'cache_duration'})) {
200                        # Compatibility with CHI
201
1
1
                        $params->{'cache_duration'} = $params->{'expires_in'};
202                }
203
204
44
44
38
93
                %defaults = (%defaults, %{$params});
205
44
144
                $defaults{'cache_duration'} ||= '1 hour';
206        }
207
208
46
875
        return \%defaults
209}
210
211 - 221
=head2 import

The module can be initialised by the C<use> directive.

    use Database::Abstraction 'directory' => '/etc/data';

or

    use Database::Abstraction { 'directory' => '/etc/data' };

=cut
222
223sub import
224{
225
38
1915
        my $pkg = shift;
226
227
38
65
        if((scalar(@_) % 2) == 0) {
228
38
39
                my %h = @_;
229
38
77
                init(Object::Configure::configure($pkg, \%h));
230        } elsif((scalar(@_) == 1) && (ref($_[0]) eq 'HASH')) {
231
0
0
                init(Object::Configure::configure($pkg, $_[0]));
232        } elsif(scalar(@_) > 0) {    # >= 3 would also work here
233
0
0
                init(\@_);
234        }
235}
236
237 - 316
=head2 new

Create an object to point to a read-only database.

Arguments:

Takes different argument formats (hash or positional)

=over 4

=item * C<auto_load>

Enable/disable the AUTOLOAD feature.
The default is to have it enabled.

=item * C<cache>

Place to store results

=item * C<cache_duration>

How long to store results in the cache (default is 1 hour).

=item * C<config_file>

Points to a configuration file which contains the parameters to C<new()>.
The file can be in any common format including C<YAML>, C<XML>, and C<INI>.
This allows the parameters to be set at run time.

=item * C<expires_in>

Synonym of C<cache_duration>, for compatibility with C<CHI>.

=item * C<dbname>

The prefix of name of the database file (default is name of the table).
The database will be held in a file such as $dbname.csv.

=item * C<directory>

Where the database file is held.
If only one argument is given to C<new()>, it is taken to be C<directory>.

=item * C<filename>

Filename containing the data.
When not given,
the filename is derived from the tablename
which in turn comes from the class name.

=item * C<logger>

Takes an optional parameter logger, which is used for warnings and traces.
Can be an object that understands warn() and trace() messages,
such as a L<Log::Log4perl> or L<Log::Any> object,
a reference to code,
or a filename.

=item * C<max_slurp_size>

CSV/PSV/XML files smaller than this are held in a HASH in RAM (default is 16K),
falling back to SQL on larger data sets.
Setting this value to 0 will turn this feature off,
thus forcing SQL to be used to access the database

=back

If the arguments are not set, tries to take from class level defaults.

Checks for abstract class usage.

Slurp mode assumes that the key column (entry) is unique.
If it isn't, searches will be incomplete.
Turn off slurp mode on those databases,
by setting a low value for max_slurp_size.

Clones existing objects with or without modifications.
Uses Carp::carp to log warnings for incorrect usage or potential mistakes.

=cut
317
318sub new {
319
35
93295
        my $class = shift;
320
35
25
        my %args;
321
322        # Handle hash or hashref arguments
323
35
98
        if((scalar(@_) == 1) && !ref($_[0])) {
324
4
7
                $args{'directory'} = $_[0];
325        } elsif(my $params = Params::Get::get_params(undef, @_)) {
326
24
24
275
37
                %args = %{$params};
327        }
328
329
35
142
        if(!defined($class)) {
330
0
0
                if((scalar keys %args) > 0) {
331                        # Using Database::Abstraction->new(), not Database::Abstraction::new()
332
0
0
                        carp(__PACKAGE__, ' use ->new() not ::new() to instantiate');
333
0
0
                        return;
334                }
335                # FIXME: this only works when no arguments are given
336
0
0
                $class = __PACKAGE__;
337        } elsif($class eq __PACKAGE__) {
338
1
12
                croak("$class: abstract class");
339        } elsif(Scalar::Util::blessed($class)) {
340                # If $class is an object, clone it with new arguments
341
2
2
1
12
                return bless { %{$class}, %args }, ref($class);
342        }
343
344        # Load the configuration from a config file, if provided
345
32
32
21
45
        %args = %{Object::Configure::configure($class, \%args)};
346
347        # Validate logger object has required methods
348
32
46988
        if(defined $args{'logger'}) {
349
32
191
                unless(Scalar::Util::blessed($args{'logger'}) && $args{'logger'}->can('info') && $args{'logger'}->can('error')) {
350
0
0
                        Carp::croak("Logger must be an object with info() and error() methods");
351                }
352        }
353
354
32
80
        croak("$class: where are the files?") unless($args{'directory'} || $defaults{'directory'});
355
356
31
183
        croak("$class: ", $args{'directory'} || $defaults{'directory'}, ' is not a directory') unless(-d ($args{'directory'} || $defaults{'directory'}));
357
358        # init(\%args);
359
360        # return bless {
361                # logger => $args{'logger'} || $logger,
362                # directory => $args{'directory'} || $directory,     # The directory containing the tables in XML, SQLite or CSV format
363                # cache => $args{'cache'} || $cache,
364                # cache_duration => $args{'cache_duration'} || $cache_duration || '1 hour',
365                # table => $args{'table'},   # The name of the file containing the table, defaults to the class name
366                # no_entry => $args{'no_entry'} || 0,
367        # }, $class;
368
369        # Re-seen keys take precedence, so defaults come first
370        # print STDERR ">>>>>>>>>>>>>>>>>>>\n";
371        # print STDERR __LINE__, "\n";
372        # print STDERR $args{'id'} || 'undef';
373        # print STDERR "\n";
374
30
132
        return bless {
375                no_entry => 0,
376                no_fixate => 0,
377                id => 'entry',
378                cache_duration => '1 hour',
379                max_slurp_size => DEFAULT_MAX_SLURP_SIZE,
380                %defaults,
381                %args,
382        }, $class;
383}
384
385 - 389
=head2  set_logger

Sets the class, code reference, or file that will be used for logging.

=cut
390
391sub set_logger
392{
393
6
1820
        my $self = shift;
394
6
8
        my $params = Params::Get::get_params('logger', @_);
395
396
5
56
        if(my $logger = $params->{'logger'}) {
397
5
6
                if(Scalar::Util::blessed($logger)) {
398
2
8
                        $self->{'logger'} = $logger;
399                } else {
400
3
5
                        $self->{'logger'} = Log::Abstraction->new($logger);
401                }
402
5
50
                return $self;
403        }
404
0
0
        Carp::croak('Usage: set_logger(logger => $logger)')
405}
406
407# Open the database connection based on the specified type (e.g., SQLite, CSV).
408# Read the data into memory or establish a connection to the database file.
409# column_names allows the column names to be overridden on CSV files
410
411sub _open
412{
413
17
34
        if(!UNIVERSAL::isa((caller)[0], __PACKAGE__)) {
414
0
0
                Carp::croak('Illegal Operation: This method can only be called by a subclass');
415        }
416
417
17
167
        my $self = shift;
418
17
31
        my $params = Params::Get::get_params(undef, @_);
419
420
17
183
        $params->{'sep_char'} ||= $self->{'sep_char'} ? $self->{'sep_char'} : '!';
421
17
25
        my $max_slurp_size = $params->{'max_slurp_size'} || $self->{'max_slurp_size'};
422
423
17
47
        my $table = $self->{'table'} || ref($self);
424
17
31
        $table =~ s/.*:://;
425
426
17
46
        $self->_trace(ref($self), ": _open $table");
427
428
17
632
        return if($self->{$table});
429
430        # Read in the database
431
17
13
        my $dbh;
432
433
17
26
        my $dir = $self->{'directory'} || $defaults{'directory'};
434
17
57
        my $dbname = $self->{'dbname'} || $defaults{'dbname'} || $table;
435
17
109
        my $slurp_file = File::Spec->catfile($dir, "$dbname.sql");
436
437
17
47
        $self->_debug("_open: try to open $slurp_file");
438
439        # Look at various places to find the file and derive the file type from the file's name
440
17
623
        if(-r $slurp_file) {
441                # SQLite file
442
0
0
                require DBI && DBI->import() unless DBI->can('connect');
443
444
0
0
                $dbh = DBI->connect("dbi:SQLite:dbname=$slurp_file", undef, undef, {
445                        sqlite_open_flags => SQLITE_OPEN_READONLY,
446                });
447        }
448
17
113
        if($dbh) {
449
0
0
                $dbh->do('PRAGMA synchronous = OFF');
450
0
0
                $dbh->do('PRAGMA cache_size = -4096');       # Use 4MB cache - negative = KB)
451
0
0
                $dbh->do('PRAGMA journal_mode = OFF');       # Read-only, no journal needed
452
0
0
                $dbh->do('PRAGMA temp_store = MEMORY');      # Store temp data in RAM
453
0
0
                $dbh->do('PRAGMA mmap_size = 1048576');      # Use 1MB memory-mapped I/O
454
0
0
                $dbh->sqlite_busy_timeout(100000);   # 10s
455
0
0
                $self->_debug("read in $table from SQLite $slurp_file");
456
0
0
                $self->{'type'} = 'DBI';
457        } elsif($self->_is_berkeley_db(File::Spec->catfile($dir, "$dbname.db"))) {
458
0
0
                $self->_debug("$table is a BerkeleyDB file");
459
0
0
                $self->{'type'} = 'BerkeleyDB';
460        } else {
461
17
15
                my $fin;
462
17
34
                ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv.gz:db.gz', '<');
463
17
475
                if(defined($slurp_file) && (-r $slurp_file)) {
464
0
0
                        require Gzip::Faster;
465
0
0
                        Gzip::Faster->import();
466
467
0
0
                        close($fin);
468
0
0
                        $fin = File::Temp->new(SUFFIX => '.csv', UNLINK => 0);
469
0
0
                        print $fin gunzip_file($slurp_file);
470
0
0
                        $slurp_file = $fin->filename();
471
0
0
                        $self->{'temp'} = $slurp_file;
472                } else {
473
17
27
                        ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'psv', '<');
474
17
338
                        if(defined($fin)) {
475                                # Pipe separated file
476
4
6
                                $params->{'sep_char'} = '|';
477                        } else {
478                                # CSV file
479
13
28
                                ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv:db', '<');
480                        }
481                }
482
17
336
                if(my $filename = $self->{'filename'} || $defaults{'filename'}) {
483
1
10
                        $self->_debug("Looking for $filename in $dir");
484
1
14
                        $slurp_file = File::Spec->catfile($dir, $filename);
485                }
486
17
91
                if(defined($slurp_file) && (-r $slurp_file)) {
487
14
76
                        close($fin) if(defined($fin));
488
14
16
                        my $sep_char = $params->{'sep_char'};
489
490
14
38
                        $self->_debug(__LINE__, ' of ', __PACKAGE__, ": slurp_file = $slurp_file, sep_char = $sep_char");
491
492
14
604
                        if($params->{'column_names'}) {
493                                $dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef,
494                                        {
495                                                csv_sep_char => $sep_char,
496                                                csv_tables => {
497                                                        $table => {
498
1
5
                                                                col_names => $params->{'column_names'},
499                                                        },
500                                                },
501                                        }
502                                );
503                        } else {
504
13
53
                                $dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef, { csv_sep_char => $sep_char});
505                        }
506
14
525686
                        $dbh->{'RaiseError'} = 1;
507
508
14
83
                        $self->_debug("read in $table from CSV $slurp_file");
509
510
14
708
                        $dbh->{csv_tables}->{$table} = {
511                                allow_loose_quotes => 1,
512                                blank_is_undef => 1,
513                                empty_is_undef => 1,
514                                binary => 1,
515                                f_file => $slurp_file,
516                                escape_char => '\\',
517                                sep_char => $sep_char,
518                                # Don't do this, causes "Bizarre copy of HASH
519                                #       in scalar assignment in error_diag
520                                #       RT121127
521                                # auto_diag => 1,
522                                auto_diag => 0,
523                                # Don't do this, it causes "Attempt to free unreferenced scalar"
524                                # callbacks => {
525                                        # after_parse => sub {
526                                                # my ($csv, @rows) = @_;
527                                                # my @rc;
528                                                # foreach my $row(@rows) {
529                                                        # if($row->[0] !~ /^#/) {
530                                                                # push @rc, $row;
531                                                        # }
532                                                # }
533                                                # return @rc;
534                                        # }
535                                # }
536                        };
537
538                        # my %options = (
539                                # allow_loose_quotes => 1,
540                                # blank_is_undef => 1,
541                                # empty_is_undef => 1,
542                                # binary => 1,
543                                # f_file => $slurp_file,
544                                # escape_char => '\\',
545                                # sep_char => $sep_char,
546                        # );
547
548                        # $dbh->{csv_tables}->{$table} = \%options;
549                        # delete $options{f_file};
550
551                        # require Text::CSV::Slurp;
552                        # Text::CSV::Slurp->import();
553                        # $self->{'data'} = Text::CSV::Slurp->load(file => $slurp_file, %options);
554
555                        # Can't slurp when we want to use our own column names as Text::xSV::Slurp has no way to override the names
556                        # FIXME: Text::xSV::Slurp can't cope well with quotes in field contents
557
14
11128
                        if(((-s $slurp_file) <= $max_slurp_size) && !$params->{'column_names'}) {
558
12
49
                                if((-s $slurp_file) == 0) {
559                                        # Empty file
560
0
0
                                        $self->{'data'} = ();
561                                } else {
562
12
1597
                                        require Text::xSV::Slurp;
563
12
24477
                                        Text::xSV::Slurp->import();
564
565
12
25
                                        $self->_debug('slurp in');
566
567
12
633
                                        my $dataref = xsv_slurp(
568                                                shape => 'aoh',
569                                                text_csv => {
570                                                        sep_char => $sep_char,
571                                                        allow_loose_quotes => 1,
572                                                        blank_is_undef => 1,
573                                                        empty_is_undef => 1,
574                                                        binary => 1,
575                                                        escape_char => '\\',
576                                                },
577                                                # string => \join('', grep(!/^\s*(#|$)/, <DATA>))
578                                                file => $slurp_file
579                                        );
580
581                                        # Ignore blank lines or lines starting with # in the CSV file
582
12
39
60
12
2170
46
56
13
                                        my @data = grep { $_->{$self->{'id'}} !~ /^\s*#/ } grep { defined($_->{$self->{'id'}}) } @{$dataref};
583
584
12
18
                                        if($self->{'no_entry'}) {
585                                                # Not keyed, will need to scan each entry
586
3
8
                                                $self->{'data'} = @data;
587                                        } else {
588                                                # keyed on the $self->{'id'} (default: "entry") column
589                                                # while(my $d = shift @data) {
590                                                        # $self->{'data'}->{$d->{$self->{'id'}}} = $d;
591                                                # }
592                                                # Build hash directly from the filtered array, better to use map to avoid data copy
593                                                # and enclose in { } to ensure it's a hash ref
594
9
34
8
46
                                                $self->{'data'} = { map { $_->{$self->{'id'}} => $_ } @data };
595                                        }
596                                }
597                        }
598
14
33
                        $self->{'type'} = 'CSV';
599                } else {
600
3
18
                        $slurp_file = File::Spec->catfile($dir, "$dbname.xml");
601
3
12
                        if(-r $slurp_file) {
602
2
6
                                if((-s $slurp_file) <= $max_slurp_size) {
603
1
345
                                        require XML::Simple;
604
1
3742
                                        XML::Simple->import();
605
606
1
26
                                        my $xml = XMLin($slurp_file);
607
1
1
18315
2
                                        my @keys = keys %{$xml};
608
1
1
                                        my $key = $keys[0];
609
1
1
                                        my @data;
610
1
2
                                        if(ref($xml->{$key}) eq 'ARRAY') {
611
1
1
1
2
                                                @data = @{$xml->{$key}};
612                                        } elsif(ref($xml) eq 'ARRAY') {
613
0
0
0
0
                                                @data = @{$xml};
614                                        } elsif((ref($xml) eq 'HASH') && !$self->{'no_entry'}) {
615
0
0
0
0
                                                if(scalar(keys %{$xml}) == 1) {
616
0
0
                                                        if($xml->{$table}) {
617
0
0
                                                                @data = $xml->{$table};
618                                                        } else {
619
0
0
                                                                die 'TODO: import arbitrary XML with "entry" field';
620                                                        }
621                                                } else {
622
0
0
                                                        die 'TODO: import arbitrary XML (differnt number of keys)';
623                                                }
624                                        } else {
625
0
0
                                                die 'TODO: import arbitrary XML, cannot currently handle ', ref($xml);
626                                        }
627
1
8
                                        $self->{'data'} = ();
628
1
2
                                        if($self->{'no_entry'}) {
629                                                # Not keyed, will need to scan each entry
630
0
0
                                                my $i = 0;
631
0
0
                                                foreach my $d(@data) {
632
0
0
                                                        $self->{'data'}->{$i++} = $d;
633                                                }
634                                        } else {
635                                                # keyed on the $self->{'id'} (default: "entry") column
636
1
1
                                                foreach my $d(@data) {
637
5
6
                                                        $self->{'data'}->{$d->{$self->{'id'}}} = $d;
638                                                }
639                                        }
640                                } else {
641
1
3
                                        $dbh = DBI->connect('dbi:XMLSimple(RaiseError=>1):');
642
1
85778
                                        $dbh->{'RaiseError'} = 1;
643
1
8
                                        $self->_debug("read in $table from XML $slurp_file");
644
1
17
                                        $dbh->func($table, 'XML', $slurp_file, 'xmlsimple_import');
645                                }
646                        } else {
647                                # throw Error(-file => "$dir/$table");
648
1
4
                                $self->_fatal("Can't find a file called '$dbname' for the table $table in $dir");
649                        }
650
2
8
                        $self->{'type'} = 'XML';
651                }
652        }
653
654
16
0
77
0
        Data::Reuse::fixate(%{$self->{'data'}}) if($self->{'data'} && (ref($self->{'data'} eq 'HASH')));
655
656
16
30
        $self->{$table} = $dbh;
657
16
96
        my @statb = stat($slurp_file);
658
16
23
        $self->{'_updated'} = $statb[9];
659
660
16
28
        return $self;
661}
662
663 - 675
=head2  selectall_arrayref

Returns a reference to an array of hash references of all the data meeting
the given criteria.

Note that since this returns an array ref,
optimisations such as "LIMIT 1" will not be used.

Use caching if that is available.

Returns undef if there are no matches.

=cut
676
677sub selectall_arrayref {
678
8
641
        my $self = shift;
679
8
5
        my $params;
680
681
8
18
        if($self->{'no_entry'}) {
682
1
3
                $params = Params::Get::get_params(undef, \@_);
683        } elsif(scalar(@_)) {
684
4
6
                $params = Params::Get::get_params('entry', @_);
685        }
686
687
8
54
        if($self->{'berkeley'}) {
688
0
0
                $self->_fatal(ref($self), ': selectall_arrayref is meaningless on a NoSQL database');
689        }
690
691
8
13
        my $table = $self->_open_table($params);
692
693
8
12
        if($self->{'data'}) {
694
4
4
2
5
                if(scalar(keys %{$params}) == 0) {
695
2
6
                        $self->_trace("$table: selectall_arrayref fast track return");
696
2
29
                        if(ref($self->{'data'}) eq 'HASH') {
697                                # $self->{'data'} looks like this:
698                                #       key1 => {
699                                #               entry => key1,
700                                #               field1 => value1,
701                                #               field2 => value2
702                                #       }, key2 => {
703                                #               entry => key2,
704                                #               field1 => valuea,
705                                #               field2 => valueb
706                                #       }
707
2
2
2
5
                                $self->_debug("$table: returning ", scalar keys %{$self->{'data'}}, ' entries');
708
2
2
22
2
                                if(scalar keys %{$self->{'data'}} <= 10) {
709
2
6
                                        $self->_debug(Dumper($self->{'data'}));
710                                }
711
2
21
                                my @rc;
712
2
2
2
3
                                foreach my $k (keys %{$self->{'data'}}) {
713
9
8
                                        push @rc, $self->{'data'}->{$k};
714                                }
715
2
4
                                return Return::Set::set_return(\@rc, { type => 'arrayref' });
716                        }
717
0
0
                        return Return::Set::set_return($self->{'data'}, { type => 'arrayref'});
718                        # my @rc = values %{$self->{'data'}};
719                        # return @rc;
720
2
4
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
721
1
3
                        return Return::Set::set_return([$self->{'data'}->{$params->{'entry'}}], { type => 'arrayref' });
722                }
723        }
724
725
5
4
        my $query;
726
5
3
        my $done_where = 0;
727
728
5
14
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
729
3
4
                $query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
730
3
3
                $done_where = 1;
731        } else {
732
2
3
                $query = "SELECT * FROM $table";
733        }
734
735
5
2
        my @query_args;
736
5
5
4
9
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
737
3
2
                my $arg = $params->{$c1};
738
3
4
                if(ref($arg)) {
739                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
740
0
0
                        $self->_fatal("selectall_arrayref(): $query: argument is not a string: ", ref($arg));
741                }
742
3
3
                if(!defined($arg)) {
743
0
0
                        my @call_details = caller(0);
744                        # throw Error::Simple("$query: value for $c1 is not defined in call from " .
745                                # $call_details[2] . ' of ' . $call_details[1]);
746
0
0
                        Carp::croak("$query: value for $c1 is not defined in call from ",
747                                $call_details[2], ' of ', $call_details[1]);
748                }
749
750
3
1
                my $keyword;
751
3
4
                if($done_where) {
752
3
2
                        $keyword = 'AND';
753                } else {
754
0
0
                        $keyword = 'WHERE';
755
0
0
                        $done_where = 1;
756                }
757
3
4
                if($arg =~ /[%_]/) {
758
0
0
                        $query .= " $keyword $c1 LIKE ?";
759                } else {
760
3
3
                        $query .= " $keyword $c1 = ?";
761                }
762
3
3
                push @query_args, $arg;
763        }
764
5
8
        if(!$self->{no_entry}) {
765
4
4
                $query .= ' ORDER BY ' . $self->{'id'};
766        }
767
768
5
13
        if(defined($query_args[0])) {
769
3
5
                $self->_debug("selectall_arrayref $query: ", join(', ', @query_args));
770        } else {
771
2
4
                $self->_debug("selectall_arrayref $query");
772        }
773
774
5
95
        my $key;
775        my $c;
776
5
7
        if($c = $self->{cache}) {
777
0
0
                $key = ref($self) . "::$query array";
778
0
0
                if(defined($query_args[0])) {
779
0
0
                        $key .= ' ' . join(', ', @query_args);
780                }
781
0
0
                if(my $rc = $c->get($key)) {
782
0
0
                        $self->_debug('cache HIT');
783
0
0
                        return $rc;     # We stored a ref to the array
784
785                        # This use of a temporary variable is to avoid
786                        #       "Implicit scalar context for array in return"
787                        # my @rc = @{$rc};
788                        # return @rc;
789                }
790
0
0
                $self->_debug('cache MISS');
791        } else {
792
5
5
                $self->_debug('cache not used');
793        }
794
795
5
65
        if(my $sth = $self->{$table}->prepare($query)) {
796
5
5800
                $sth->execute(@query_args) ||
797                        # throw Error::Simple("$query: @query_args");
798                        croak("$query: @query_args");
799
800
5
9031
                my $rc;
801
5
33
                while(my $href = $sth->fetchrow_hashref()) {
802
27
27
27
678
53
28
                        push @{$rc}, $href if(scalar keys %{$href});
803                }
804
5
110
                if($c) {
805
0
0
                        $c->set($key, $rc, $self->{'cache_duration'});    # Store a ref to the array
806                }
807
808
5
5
8
10
                Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
809
810
5
856
                return $rc;
811        }
812
0
0
        $self->_warn("selectall_array failure on $query: @query_args");
813        # throw Error::Simple("$query: @query_args");
814
0
0
        croak("$query: @query_args");
815
816        # my @rc = grep { defined $_ } $self->selectall_array(@_);
817
818        # return if(scalar(@rc) == 0);
819
820        # Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
821        # return \@rc;
822}
823
824 - 828
=head2  selectall_hashref

Deprecated misleading legacy name for selectall_arrayref.

=cut
829
830sub selectall_hashref
831{
832
6
1145
        my $self = shift;
833
6
11
        return $self->selectall_arrayref(@_);
834}
835
836 - 845
=head2  selectall_array

Similar to selectall_array but returns an array of hash references.

Con:    Copies more data around than selectall_arrayref
Pro:    Better determination of list vs scalar mode than selectall_arrayref by setting "LIMIT 1"

TODO:   Remove duplicated code

=cut
846
847sub selectall_array
848{
849
7
5
        my $self = shift;
850
851
7
12
        if($self->{'berkeley'}) {
852
0
0
                Carp::croak(ref($self), ': selectall_array is meaningless on a NoSQL database');
853        }
854
855
7
13
        my $params = Params::Get::get_params(undef, \@_);
856
7
72
        my $table = $self->_open_table($params);
857
858
7
12
        if($self->{'data'}) {
859
4
4
2
5
                if(scalar(keys %{$params}) == 0) {
860
3
6
                        $self->_trace("$table: selectall_array fast track return");
861
3
40
                        if(ref($self->{'data'}) eq 'HASH') {
862
3
3
3
7
                                return values %{$self->{'data'}};
863                        }
864
0
0
0
0
                        return @{$self->{'data'}};
865                        # my @rc = values %{$self->{'data'}};
866                        # return @rc;
867
1
4
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
868
0
0
                        return $self->{'data'}->{$params->{'entry'}};
869                }
870        }
871
872
4
1
        my $query;
873
4
4
        my $done_where = 0;
874
875
4
11
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
876
2
2
                $query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
877
2
1
                $done_where = 1;
878        } else {
879
2
1
                $query = "SELECT * FROM $table";
880        }
881
882
4
4
        my @query_args;
883
4
4
2
9
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
884
3
3
                my $arg = $params->{$c1};
885
3
4
                if(ref($arg)) {
886                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
887
0
0
                        $self->_fatal("selectall_array(): $query: argument is not a string: ", ref($arg));
888                }
889
890
3
2
                my $keyword;
891
3
4
                if($done_where) {
892
2
11
                        $keyword = 'AND';
893                } else {
894
1
1
                        $keyword = 'WHERE';
895
1
1
                        $done_where = 1;
896                }
897
3
3
                if(!defined($arg)) {
898
1
2
                        $query .= " $keyword $c1 IS NULL"
899                } else {
900
2
4
                        if($arg =~ /[%_]/) {
901
0
0
                                $query .= " $keyword $c1 LIKE ?";
902                        } else {
903
2
2
                                $query .= " $keyword $c1 = ?";
904                        }
905
2
2
                        push @query_args, $arg;
906                }
907        }
908
4
8
        if(!$self->{no_entry}) {
909
3
4
                $query .= ' ORDER BY ' . $self->{'id'};
910        }
911
4
5
        if(!wantarray) {
912
0
0
                $query .= ' LIMIT 1';
913        }
914
915
4
4
        if(defined($query_args[0])) {
916
2
6
                $self->_debug("selectall_array $query: ", join(', ', @query_args));
917        } else {
918
2
4
                $self->_debug("selectall_array $query");
919        }
920
921
4
64
        my $key;
922        my $c;
923
4
19
        if($c = $self->{cache}) {
924
0
0
                $key = ref($self) . '::' . $query;
925
0
0
                if(wantarray) {
926
0
0
                        $key .= ' array';
927                }
928
0
0
                if(defined($query_args[0])) {
929
0
0
                        $key .= ' ' . join(', ', @query_args);
930                }
931
0
0
                if(my $rc = $c->get($key)) {
932
0
0
                        $self->_debug('cache HIT');
933
0
0
0
0
                        return wantarray ? @{$rc} : $rc;        # We stored a ref to the array
934
935                        # This use of a temporary variable is to avoid
936                        #       "Implicit scalar context for array in return"
937                        # my @rc = @{$rc};
938                        # return @rc;
939                }
940
0
0
                $self->_debug('cache MISS');
941        } else {
942
4
7
                $self->_debug('cache not used');
943        }
944
945
4
67
        if(my $sth = $self->{$table}->prepare($query)) {
946
4
4316
                $sth->execute(@query_args) ||
947                        # throw Error::Simple("$query: @query_args");
948                        croak("$query: @query_args");
949
950
4
7042
                my $rc;
951
4
18
                while(my $href = $sth->fetchrow_hashref()) {
952
22
527
                        return $href if(!wantarray);    # FIXME: Doesn't store in the cache
953
22
22
13
42
                        push @{$rc}, $href;
954                }
955
4
78
                if($c) {
956
0
0
                        $c->set($key, $rc, $self->{'cache_duration'});    # Store a ref to the array
957                }
958
959
4
7
                if($rc) {
960
4
4
8
8
                        Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
961
4
4
675
18
                        return @{$rc};
962                }
963
0
0
                return;
964        }
965
0
0
        $self->_warn("selectall_array failure on $query: @query_args");
966        # throw Error::Simple("$query: @query_args");
967
0
0
        croak("$query: @query_args");
968}
969
970 - 974
=head2  selectall_hash

Deprecated misleading legacy name for selectall_array.

=cut
975
976sub selectall_hash
977{
978
7
1618
        my $self = shift;
979
7
19
        return $self->selectall_array(@_);
980}
981
982 - 986
=head2  count

Return the number items/rows matching the given criteria

=cut
987
988sub count
989{
990
5
745
        my $self = shift;
991
992
5
8
        if($self->{'berkeley'}) {
993
0
0
                Carp::croak(ref($self), ': count is meaningless on a NoSQL database');
994        }
995
996
5
10
        my $params = Params::Get::get_params(undef, \@_);
997
5
56
        my $table = $self->_open_table($params);
998
999
5
7
        if($self->{'data'}) {
1000
3
3
3
3
                if(scalar(keys %{$params}) == 0) {
1001
2
4
                        $self->_trace("$table: count fast track return");
1002
2
24
                        if(ref($self->{'data'}) eq 'HASH') {
1003
2
2
2
6
                                return scalar keys %{$self->{'data'}};
1004                        }
1005
0
0
0
0
                        return scalar @{$self->{'data'}};
1006
1
3
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
1007
0
0
                        return $self->{'data'}->{$params->{'entry'}} ? 1 : 0;
1008                }
1009        }
1010
1011
3
3
        my $query;
1012
3
2
        my $done_where = 0;
1013
1014
3
10
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1015
1
1
                $query = "SELECT COUNT(*) FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
1016
1
1
                $done_where = 1;
1017        } elsif($self->{no_entry}) {
1018
2
2
                $query = "SELECT COUNT(*) FROM $table";
1019        } else {
1020
0
0
                $query = "SELECT COUNT(entry) FROM $table";
1021        }
1022
1023
3
3
        my @query_args;
1024
3
3
3
5
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1025
1
1
                my $arg = $params->{$c1};
1026
1
1
                if(ref($arg)) {
1027                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
1028
0
0
                        $self->_fatal("count(): $query: argument is not a string: ", ref($arg));
1029                }
1030
1
2
                if(!defined($arg)) {
1031
0
0
                        my @call_details = caller(0);
1032                        # throw Error::Simple("$query: value for $c1 is not defined in call from " .
1033                                # $call_details[2] . ' of ' . $call_details[1]);
1034
0
0
                        Carp::croak("$query: value for $c1 is not defined in call from ",
1035                                $call_details[2], ' of ', $call_details[1]);
1036                }
1037
1038
1
0
                my $keyword;
1039
1
1
                if($done_where) {
1040
1
2
                        $keyword = 'AND';
1041                } else {
1042
0
0
                        $keyword = 'WHERE';
1043
0
0
                        $done_where = 1;
1044                }
1045
1
1
                if($arg =~ /[%_]/) {
1046
0
0
                        $query .= " $keyword $c1 LIKE ?";
1047                } else {
1048
1
5
                        $query .= " $keyword $c1 = ?";
1049                }
1050
1
2
                push @query_args, $arg;
1051        }
1052
3
5
        if(!$self->{no_entry}) {
1053
1
2
                $query .= ' ORDER BY ' . $self->{'id'};
1054        }
1055
1056
3
3
        if(defined($query_args[0])) {
1057
1
2
                $self->_debug("count $query: ", join(', ', @query_args));
1058        } else {
1059
2
4
                $self->_debug("count $query");
1060        }
1061
1062
3
43
        my $key;
1063        my $c;
1064
3
4
        if($c = $self->{cache}) {
1065
0
0
                $key = ref($self) . '::' . $query;
1066
0
0
                $key =~ s/COUNT\((.+?)\)/$1/;
1067
0
0
                $key .= ' array';
1068
0
0
                if(defined($query_args[0])) {
1069
0
0
                        $key .= ' ' . join(', ', @query_args);
1070                }
1071
0
0
                if(my $rc = $c->get($key)) {
1072                        # Unlikely
1073
0
0
                        $self->_debug('cache HIT');
1074
0
0
0
0
                        return scalar @{$rc};   # We stored a ref to the array
1075                }
1076
0
0
                $self->_debug('cache MISS');
1077        } else {
1078
3
3
                $self->_debug('cache not used');
1079        }
1080
1081
3
42
        if(my $sth = $self->{$table}->prepare($query)) {
1082
3
1755
                $sth->execute(@query_args) ||
1083                        # throw Error::Simple("$query: @query_args");
1084                        croak("$query: @query_args");
1085
1086
3
3779
                my $count = $sth->fetchrow_arrayref()->[0];
1087
1088
3
64
                $sth->finish();
1089
1090
3
16
                return $count;
1091        }
1092
0
0
        $self->_warn("count failure on $query: @query_args");
1093        # throw Error::Simple("$query: @query_args");
1094
0
0
        croak("$query: @query_args");
1095}
1096
1097 - 1110
=head2  fetchrow_hashref

Returns a hash reference for a single row in a table.

It searches for the given arguments, searching IS NULL if the value is C<undef>

   my $res = $foo->fetchrow_hashref(entry => 'one');

Special argument: table: determines the table to read from if not the default,
which is worked out from the class name

When no_entry is not set allow just one argument to be given: the entry value.

=cut
1111
1112sub fetchrow_hashref {
1113
8
1134
        my $self = shift;
1114
1115
8
20
        $self->_trace('Entering fetchrow_hashref');
1116
1117
8
106
        my $params;
1118
1119
8
12
        if(!$self->{'no_entry'}) {
1120
7
9
                $params = Params::Get::get_params('entry', @_);
1121        } else {
1122
1
4
                $params = Params::Get::get_params(undef, @_);
1123        }
1124
1125
8
90
        my $table = $self->_open_table($params);
1126
1127        # ::diag($self->{'type'});
1128
8
7
18
16
        if($self->{'data'} && (!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1129
4
5
                $self->_debug('Fast return from slurped data');
1130
4
64
                return $self->{'data'}->{$params->{'entry'}};
1131        }
1132
1133
4
7
        if($self->{'berkeley'}) {
1134                # print STDERR ">>>>>>>>>>>>\n";
1135                # ::diag(Data::Dumper->new([$self->{'berkeley'}])->Dump());
1136
0
0
0
0
                if((!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1137
0
0
                        return { entry => $self->{'berkeley'}->{$params->{'entry'}} };
1138                }
1139
0
0
                my $id = $self->{'id'};
1140
0
0
0
0
                if($self->{'no_entry'} && (scalar keys(%{$params}) == 1) && defined($id) && defined($params->{$id})) {
1141
0
0
                        if(my $rc = $self->{'berkeley'}->{$params->{$id}}) {
1142
0
0
                                return { $params->{$id} => $rc }  # Return key->value as a hash pair
1143                        }
1144
0
0
                        return;
1145                }
1146
0
0
                Carp::croak(ref($self), ': fetchrow_hashref is meaningless on a NoSQL database');
1147        }
1148
1149
4
5
        my $query = 'SELECT * FROM ';
1150
4
6
        if(my $t = delete $params->{'table'}) {
1151
0
0
                $query .= $t;
1152        } else {
1153
4
5
                $query .= $table;
1154        }
1155
4
3
        my $done_where = 0;
1156
1157
4
12
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1158
3
5
                $query .= ' WHERE ' . $self->{'id'} . ' IS NOT NULL AND ' . $self->{'id'} . " NOT LIKE '#%'";
1159
3
2
                $done_where = 1;
1160        }
1161
4
4
        my @query_args;
1162
4
4
2
6
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1163
4
3
                my $keyword;
1164
1165
4
5
                if($done_where) {
1166
3
1
                        $keyword = 'AND';
1167                } else {
1168
1
1
                        $keyword = 'WHERE';
1169
1
0
                        $done_where = 1;
1170                }
1171
4
7
                if(my $arg = $params->{$c1}) {
1172
3
2
                        if(ref($arg)) {
1173                                # throw Error::Simple("$query: argument is not a string: " . ref($arg));
1174
0
0
                                $self->_fatal("fetchrow_hash(): $query: argument is not a string: ", ref($arg));
1175                        }
1176
1177
3
6
                        if($arg =~ /[%_]/) {
1178
0
0
                                $query .= " $keyword $c1 LIKE ?";
1179                        } else {
1180
3
3
                                $query .= " $keyword $c1 = ?";
1181                        }
1182
3
5
                        push @query_args, $arg;
1183                } else {
1184
1
2
                        $query .= " $keyword $c1 IS NULL";
1185                        # my @call_details = caller(0);
1186                        # # throw Error::Simple("$query: value for $c1 is not defined in call from " .
1187                                # # $call_details[2] . ' of ' . $call_details[1]);
1188                        # Carp::croak("$query: value for $c1 is not defined in call from ",
1189                                # $call_details[2], ' of ', $call_details[1]);
1190                }
1191        }
1192        # $query .= ' ORDER BY entry LIMIT 1';
1193
4
5
        $query .= ' LIMIT 1';
1194
4
6
        if(defined($query_args[0])) {
1195
3
5
                my @call_details = caller(0);
1196
3
37
                $self->_debug("fetchrow_hashref $query: ", join(', ', @query_args),
1197                        ' called from ', $call_details[2], ' of ', $call_details[1]);
1198        } else {
1199
1
2
                $self->_debug("fetchrow_hashref $query");
1200        }
1201
4
56
        my $key = ref($self) . '::';
1202
4
6
        if(defined($query_args[0])) {
1203
3
5
                if(wantarray) {
1204
0
0
                        $key .= 'array ';
1205                }
1206
3
6
                $key .= "fetchrow $query " . join(', ', @query_args);
1207        } else {
1208
1
1
                $key .= "fetchrow $query";
1209        }
1210
4
4
        my $c;
1211
4
5
        if($c = $self->{cache}) {
1212
0
0
                if(my $rc = $c->get($key)) {
1213
0
0
                        if(wantarray) {
1214
0
0
                                if(ref($rc) eq 'ARRAY') {
1215
0
0
0
0
                                        return @{$rc};  # We stored a ref to the array
1216                                }
1217                        } else {
1218
0
0
                                return $rc;
1219                        }
1220                }
1221        }
1222
1223
4
23
        my $sth = $self->{$table}->prepare($query) or die $self->{$table}->errstr();
1224        # $sth->execute(@query_args) || throw Error::Simple("$query: @query_args");
1225
4
10655
        $sth->execute(@query_args) || croak("$query: @query_args");
1226
4
4648
        my $rc = $sth->fetchrow_hashref();
1227
4
162
        if($c) {
1228
0
0
                if($rc) {
1229
0
0
                        $self->_debug("stash $key=>$rc in the cache for ", $self->{'cache_duration'});
1230
0
0
                        $self->_debug("returns ", Data::Dumper->new([$rc])->Dump());
1231                } else {
1232
0
0
                        $self->_debug("Stash $key=>undef in the cache for ", $self->{'cache_duration'});
1233                }
1234
0
0
                $c->set($key, $rc, $self->{'cache_duration'});
1235        }
1236
4
24
        return $rc;
1237}
1238
1239 - 1252
=head2  execute

Execute the given SQL query on the database.
In an array context, returns an array of hash refs,
in a scalar context returns a hash of the first row

On CSV tables without no_entry, it may help to add
"WHERE entry IS NOT NULL AND entry NOT LIKE '#%'"
to the query.

If the data have been slurped,
this will still work by accessing that actual database.

=cut
1253
1254sub execute
1255{
1256
2
925
        my $self = shift;
1257
1258
2
4
        if($self->{'berkeley'}) {
1259
0
0
                Carp::croak(ref($self), ': execute is meaningless on a NoSQL database');
1260        }
1261
1262
2
7
        my $args = Params::Get::get_params('query', @_);
1263        # Ensure the 'query' parameter is provided
1264        Carp::croak(__PACKAGE__, ': Usage: execute(query => $query)')
1265
2
21
                unless defined $args->{'query'};
1266
1267
2
3
        my $table = $self->_open_table($args);
1268
1269
2
3
        my $query = $args->{'query'};
1270
1271        # Append "FROM <table>" if missing
1272
2
11
        $query .= " FROM $table" unless $query =~ /\sFROM\s/i;
1273
1274        # Log the query if a logger is available
1275
2
5
        $self->_debug("execute $query");
1276
1277        # Prepare and execute the query
1278
2
32
        my $sth = $self->{$table}->prepare($query);
1279
2
1287
        $sth->execute() or croak($query);    # Die with the query in case of error
1280
1281        # Fetch the results
1282
2
1732
        my @results;
1283
2
22
        while (my $row = $sth->fetchrow_hashref()) {
1284                # Return a single hashref if scalar context is expected
1285
6
150
                return $row unless wantarray;
1286
6
13
                push @results, $row;
1287        }
1288
1289        # Return all rows as an array in list context
1290
2
41
        return @results;
1291}
1292
1293 - 1297
=head2 updated

Returns the timestamp of the last database update.

=cut
1298
1299sub updated {
1300
1
384
        my $self = shift;
1301
1302
1
2
        return $self->{'_updated'};
1303}
1304
1305 - 1326
=head2 AUTOLOAD

Directly access a database column.

Returns all entries in a column, a single entry based on criteria.
Uses cached data if available.

Returns an array of the matches,
or only the first when called in scalar context

If the database has a column called "entry" you can do a quick lookup with

    my $value = $foo->column('123'); # where "column" is the value you're after

    my @entries = $foo->entry();
    print 'There are ', scalar(@entries), " entries in the database\n";

Set distinct or unique to 1 if you're after a unique list.

Throws an error in slurp mode when an invalid column name is given.

=cut
1327
1328sub AUTOLOAD {
1329
30
6167
        our $AUTOLOAD;
1330
30
96
        my ($column) = $AUTOLOAD =~ /::(\w+)$/;
1331
1332
30
43
        return if($column eq 'DESTROY');
1333
1334
30
44
        my $self = shift or return;
1335
1336
30
30
        Carp::croak(__PACKAGE__, ": Unknown column $column") if(!ref($self));
1337
1338        # Allow the AUTOLOAD feature to be disabled
1339
30
42
        Carp::croak(__PACKAGE__, ": Unknown column $column") if(exists($self->{'auto_load'}) && boolean($self->{'auto_load'})->isFalse());
1340
1341        # Validate column name - only allow safe column name
1342
30
54
        Carp::croak(__PACKAGE__, ": Invalid column name: $column") unless $column =~ /^[a-zA-Z_][a-zA-Z0-9_]*$/;
1343
1344
30
44
        my $table = $self->_open_table();
1345
1346
29
24
        my %params;
1347
29
62
        if(ref($_[0]) eq 'HASH') {
1348
2
2
2
4
                %params = %{$_[0]};
1349        } elsif((scalar(@_) % 2) == 0) {
1350
11
14
                %params = @_;
1351        } elsif(scalar(@_) == 1) {
1352                # Don't error on key-value databases, since there's no idea of columns
1353
16
21
                if($self->{'no_entry'} && !$self->{'berkeley'}) {
1354
0
0
                        Carp::croak(ref($self), "::($_[0]): ", $self->{'id'}, ' is not a column');
1355                }
1356
16
20
                $params{'entry'} = shift;
1357        }
1358
1359
29
34
        if($self->{'berkeley'}) {
1360
0
0
                if(my $id = $self->{'id'}) {
1361
0
0
                        return $self->{'berkeley'}->{$params{$id}};
1362                }
1363
0
0
                return $self->{'berkeley'}->{$params{'entry'}};
1364        }
1365
1366
29
31
        croak('Where did the data come from?') if(!defined($self->{'type'}));
1367
29
16
        my $query;
1368
29
21
        my $done_where = 0;
1369
29
53
        my $distinct = delete($params{'distinct'}) || delete($params{'unique'});
1370
1371
29
45
        if(wantarray && !$distinct) {
1372
5
12
                if(((scalar keys %params) == 0) && (my $data = $self->{'data'})) {
1373                        # Return all the entries in the column
1374
3
12
3
3
15
6
                        return map { $_->{$column} } values %{$data};
1375                }
1376
2
6
                if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1377
1
1
                        $query = "SELECT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1378
1
1
                        $done_where = 1;
1379                } else {
1380
1
1
                        $query = "SELECT $column FROM $table";
1381                }
1382        } else {
1383
24
28
                if(my $data = $self->{'data'}) {
1384                        # The data has been read in using Text::xSV::Slurp,
1385                        #       so no need to do any SQL
1386
18
22
                        $self->_debug('AUTOLOAD using slurped data');
1387
18
348
                        if($self->{'no_entry'}) {
1388
0
0
                                $self->_debug('no_entry is set');
1389
0
0
                                my ($key, $value) = %params;
1390
0
0
                                if(defined($key)) {
1391
0
0
                                        $self->_debug("key = $key, value = $value, column = $column");
1392
0
0
0
0
                                        foreach my $row(@{$data}) {
1393
0
0
                                                if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1394
0
0
                                                        if(defined($rc)) {
1395
0
0
                                                                $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1396                                                        } else {
1397
0
0
                                                                $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1398                                                        }
1399
0
0
                                                        return $rc
1400                                                }
1401                                        }
1402
0
0
                                        $self->_debug('not found in slurped data');
1403                                }
1404                        } elsif(((scalar keys %params) == 1) && defined(my $key = $params{'entry'})) {
1405                                # Look up the key
1406
1407                                # This weird code is to stop the data hash becoming polluted with empty
1408                                #       values as we look things up
1409                                # my $rc = $data->{$key}->{$column};
1410
14
10
                                my $rc;
1411
14
16
                                if(defined(my $hash = $data->{$key})) {
1412                                        # Look up the key
1413
10
10
                                        if(!exists($hash->{$column})) {
1414
1
7
                                                Carp::croak(__PACKAGE__, ": There is no column $column in $table");
1415                                        }
1416
9
9
                                        $rc = $hash->{$column};
1417                                }
1418
13
12
                                if(defined($rc)) {
1419
8
39
                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1420                                } else {
1421
5
11
                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1422                                }
1423
13
211
                                return $rc
1424                        } elsif((scalar keys %params) == 0) {
1425
2
2
                                if(wantarray) {
1426
2
4
                                        if($distinct) {
1427                                                # https://stackoverflow.com/questions/7651/how-do-i-remove-duplicate-items-from-an-array-in-perl
1428
2
7
7
2
2
6
7
2
                                                my %h = map { $_, 1 } map { $_->{$column} } values %{$data};
1429
2
6
                                                return keys %h;
1430                                        }
1431
0
0
0
0
0
0
                                        return map { $_->{$column} } values %{$data}
1432                                }
1433                                # FIXME - this works but really isn't the right way to do it
1434
0
0
0
0
                                foreach my $v (values %{$data}) {
1435
0
0
                                        return $v->{$column}
1436                                }
1437                        } else {
1438                                # It's keyed, but we're not querying off it
1439
2
3
                                my ($key, $value) = %params;
1440
2
2
3
2
                                foreach my $row (values %{$data}) {
1441
7
17
                                        if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1442
2
2
                                                if(defined($rc)) {
1443
2
4
                                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1444                                                } else {
1445
0
0
                                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1446                                                }
1447
2
43
                                                return $rc
1448                                        }
1449                                }
1450                        }
1451                        return
1452
0
0
                }
1453                # Data has not been slurped in
1454
6
14
                if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1455
0
0
                        $query = "SELECT DISTINCT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1456
0
0
                        $done_where = 1;
1457                } else {
1458
6
7
                        $query = "SELECT DISTINCT $column FROM $table";
1459                }
1460        }
1461
8
6
        my @args;
1462
8
14
        while(my ($key, $value) = each %params) {
1463
6
14
                $self->_debug(__PACKAGE__, ": AUTOLOAD adding key/value pair $key=>$value");
1464
6
361
                if(defined($value)) {
1465
6
7
                        if($done_where) {
1466
1
1
                                $query .= " AND $key = ?";
1467                        } else {
1468
5
5
                                $query .= " WHERE $key = ?";
1469
5
5
                                $done_where = 1;
1470                        }
1471
6
13
                        push @args, $value;
1472                } else {
1473
0
0
                        $self->_debug("AUTOLOAD params $key isn't defined");
1474
0
0
                        if($done_where) {
1475
0
0
                                $query .= " AND $key IS NULL";
1476                        } else {
1477
0
0
                                $query .= " WHERE $key IS NULL";
1478
0
0
                                $done_where = 1;
1479                        }
1480                }
1481        }
1482
8
9
        if(wantarray) {
1483
3
4
                $query .= " ORDER BY $column";
1484        } else {
1485
5
4
                $query .= ' LIMIT 1';
1486        }
1487
8
14
        if(scalar(@args) && $args[0]) {
1488
6
17
                $self->_debug("AUTOLOAD $query: ", join(', ', @args));
1489        } else {
1490
2
3
                $self->_debug("AUTOLOAD $query");
1491        }
1492
8
363
        my $cache;
1493
8
9
        my $key = ref($self) . '::';
1494
8
15
        if($cache = $self->{cache}) {
1495
0
0
                if(wantarray) {
1496
0
0
                        $key .= 'array ';
1497                }
1498
0
0
                if(defined($args[0])) {
1499
0
0
                        $key .= "fetchrow $query " . join(', ', @args);
1500                } else {
1501
0
0
                        $key .= "fetchrow $query";
1502                }
1503
0
0
                if(my $rc = $cache->get($key)) {
1504
0
0
                        $self->_debug('cache HIT');
1505
0
0
0
0
                        return wantarray ? @{$rc} : $rc;        # We stored a ref to the array
1506                }
1507
0
0
                $self->_debug('cache MISS');
1508        } else {
1509
8
10
                $self->_debug('cache not used');
1510        }
1511        # my $sth = $self->{$table}->prepare($query) || throw Error::Simple($query);
1512
8
380
        my $sth = $self->{$table}->prepare($query) || croak($query);
1513        # $sth->execute(@args) || throw Error::Simple($query);
1514
8
13743
        $sth->execute(@args) || croak($query);
1515
1516
8
17925
        if(wantarray) {
1517
3
10
3
3
191
11
                my @rc = map { $_->[0] } @{$sth->fetchall_arrayref()};
1518
3
13
                if($cache) {
1519
0
0
                        $cache->set($key, \@rc, $self->{'cache_duration'});       # Store a ref to the array
1520                }
1521
3
9
                Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
1522
3
176
                return @rc;
1523        }
1524
5
27
        my $rc = $sth->fetchrow_array();     # Return the first match only
1525
5
83
        $sth->finish();
1526
5
20
        if($cache) {
1527
0
0
                return $cache->set($key, $rc, $self->{'cache_duration'});
1528        }
1529
5
16
        return $rc;
1530}
1531
1532sub DESTROY
1533{
1534
31
9355
        if(defined($^V) && ($^V ge 'v5.14.0')) {
1535
31
59
                return if ${^GLOBAL_PHASE} eq 'DESTRUCT';       # >= 5.14.0 only
1536        }
1537
31
48
        my $self = shift;
1538
1539        # Clean up temporary file
1540
31
45
        if($self->{'temp'}) {
1541
0
0
                unlink delete $self->{'temp'};
1542        }
1543
1544        # Clean up database handles
1545
31
58
        my $table_name = $self->{'table'} || ref($self);
1546
31
79
        $table_name =~ s/.*:://;
1547
1548
31
45
        if(my $dbh = delete $self->{$table_name}) {
1549
14
108
                $dbh->disconnect() if $dbh->can('disconnect');
1550
14
295
                $dbh->finish() if $dbh->can('finish');
1551        }
1552
1553        # Clean up Berkeley DB
1554
31
42
        if($self->{'berkeley'}) {
1555
0
0
                eval {
1556
0
0
0
0
                        untie %{$self->{'berkeley'}};
1557                };
1558
0
0
                delete $self->{'berkeley'};
1559        }
1560
1561        # Clear all other attributes to break potential circular references
1562
31
46
        foreach my $key (keys %$self) {
1563
291
673
                delete $self->{$key};
1564        }
1565}
1566
1567# Determine the table and open the database
1568sub _open_table
1569{
1570
60
56
        my($self, $params) = @_;
1571
1572        # Get table name (remove package name prefix if present)
1573
60
159
        my $table = $params->{'table'} || $self->{'table'} || ref($self);
1574
60
121
        $table =~ s/.*:://;
1575
1576        # Open a connection if it's not already open
1577
60
121
        $self->_open() if((!$self->{$table}) && (!$self->{'data'}));
1578
1579
59
69
        return $table;
1580}
1581
1582# Determine whether a given file is a valid Berkeley DB file.
1583# It combines a fast preliminary check with a more thorough validation step for accuracy.
1584# It looks for the magic number at both byte 0 and byte 12
1585# TODO: Combine _db_0 and _db_12 as they are very similar routines
1586sub _is_berkeley_db {
1587
17
30
        my ($self, $file) = @_;
1588
1589        # Step 1: Check magic number
1590
17
189
        open my $fh, '<', $file or return 0;
1591
0
0
        binmode $fh;
1592
1593
0
0
        my $is_db = (($self->_is_berkeley_db_0($fh)) || ($self->_is_berkeley_db_12($fh)));
1594
0
0
        close $fh;
1595
1596
0
0
        if($is_db) {
1597                # Step 2: Attempt to open as Berkeley DB
1598
1599
0
0
                require DB_File && DB_File->import();
1600
1601
0
0
                my %bdb;
1602
0
0
                if(tie %bdb, 'DB_File', $file, O_RDONLY, 0644, $DB_File::DB_HASH) {
1603                        # untie %db;
1604
0
0
                        $self->{'berkeley'} = \%bdb;
1605
0
0
                        return 1;       # Successfully identified as a Berkeley DB file
1606                }
1607        }
1608
0
0
        return 0;
1609}
1610
1611# Determine whether a given file is a valid Berkeley DB file.
1612# It combines a fast preliminary check with a more thorough validation step for accuracy.
1613sub _is_berkeley_db_0
1614{
1615
0
0
        my ($self, $fh) = @_;
1616
1617        # Read the first 4 bytes (magic number)
1618
0
0
        read($fh, my $magic_bytes, 4) == 4 or return 0;
1619
1620        # Unpack both big-endian and little-endian values
1621
0
0
        my $magic_be = unpack('N', $magic_bytes);       # Big-endian
1622
0
0
        my $magic_le = unpack('V', $magic_bytes);       # Little-endian
1623
1624        # Known Berkeley DB magic numbers (in both endian formats)
1625
0
0
0
0
        my %known_magic = map { $_ => 1 } (
1626                0x00061561,     # Btree
1627                0x00053162,     # Hash
1628                0x00042253,     # Queue
1629                0x00052444,     # Recno
1630        );
1631
1632
0
0
        return($known_magic{$magic_be} || $known_magic{$magic_le});
1633}
1634
1635sub _is_berkeley_db_12
1636{
1637
0
0
        my ($self, $fh) = @_;
1638
0
0
        my $header;
1639
1640
0
0
        seek $fh, 12, 0 or return 0;
1641
0
0
        read($fh, $header, 4) or return 0;
1642
1643
0
0
        $header = substr(unpack('H*', $header), 0, 4);
1644
1645        # Berkeley DB magic numbers
1646
0
0
        return($header eq '6115' || $header eq '1561'); # Btree
1647}
1648
1649# Log and remember a message
1650sub _log
1651{
1652
185
237
        my ($self, $level, @messages) = @_;
1653
1654        # FIXME: add caller's function
1655        # if(($level eq 'warn') || ($level eq 'notice')) {
1656
185
185
126
500
                push @{$self->{'messages'}}, { level => $level, message => join('', grep defined, @messages) };
1657        # }
1658
1659
185
326
        if(scalar(@messages) && (my $logger = $self->{'logger'})) {
1660
185
364
                $self->{'logger'}->$level(join('', grep defined, @messages));
1661        }
1662}
1663
1664sub _debug {
1665
137
225
        my $self = shift;
1666
137
154
        $self->_log('debug', @_);
1667}
1668
1669sub _trace {
1670
47
47
        my $self = shift;
1671
47
71
        $self->_log('trace', @_);
1672}
1673
1674# Emit a warning message somewhere
1675sub _warn {
1676
0
0
        my $self = shift;
1677
0
0
        my $params = Params::Get::get_params('warning', @_);
1678
1679
0
0
        $self->_log('warn', $params->{'warning'});
1680
0
0
        Carp::carp(join('', grep defined, $params->{'warning'}));
1681}
1682
1683# Die
1684sub _fatal {
1685
1
0
        my $self = shift;
1686
1
2
        my $params = Params::Get::get_params('warning', @_);
1687
1688
1
8
        $self->_log('error', $params->{'warning'});
1689
1
1223
        Carp::croak(join('', grep defined, $params->{'warning'}));
1690}
1691
1692 - 1744
=head1 AUTHOR

Nigel Horne, C<< <njh at nigelhorne.com> >>

=head1 SUPPORT

This module is provided as-is without any warranty.

Please report any bugs or feature requests to C<bug-database-abstraction at rt.cpan.org>,
or through the web interface at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Database-Abstraction>.
I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.

=head1 BUGS

The default delimiter for CSV files is set to '!', not ',' for historical reasons.
I really ought to fix that.

It would be nice for the key column to be called key, not entry,
however key's a reserved word in SQL.

The no_entry parameter should be no_id.

XML slurping is hard,
so if XML fails for you on a small file force non-slurping mode with

    $foo = MyPackageName::Database::Foo->new({
        directory => '/var/dat',
        max_slurp_size => 0  # force to not use slurp and therefore to use SQL
    });

=head1 SEE ALSO

=over 4

=item * Test coverage report: L<https://nigelhorne.github.io/Database-Abstraction/coverage/>

=back

=head1 LICENSE AND COPYRIGHT

Copyright 2015-2025 Nigel Horne.

This program is released under the following licence: GPL2.
Usage is subject to licence terms.
The licence terms of this software are as follows:
Personal single user, single computer use: GPL2
All other users (for example, Commercial, Charity, Educational, Government)
must apply in writing for a licence for use from Nigel Horne at the
above e-mail.

=cut
1745
17461;