File Coverage

File:blib/lib/Database/Abstraction.pm
Coverage:64.4%

linestmtbrancondsubtimecode
1package Database::Abstraction;
2
3# Author Nigel Horne: njh@nigelhorne.com
4# Copyright (C) 2015-2025, Nigel Horne
5
6# Usage is subject to licence terms.
7# The licence terms of this software are as follows:
8# Personal single user, single computer use: GPL2
9# All other users (for example, Commercial, Charity, Educational, Government)
10#       must apply in writing for a licence for use from Nigel Horne at the
11#       above e-mail.
12
13# TODO: Switch "entry" to off by default, and enable by passing 'entry'
14#       though that wouldn't be so nice for AUTOLOAD
15# TODO: support a directory hierarchy of databases
16# TODO: consider returning an object or array of objects, rather than hashes
17# TODO: Add redis database - could be of use for Geo::Coder::Free
18#       use select() to select a database - use the table arg
19#       new(database => 'redis://servername');
20# TODO: Add a "key" property, defaulting to "entry", which would be the name of the key
21# TODO: The maximum number to return should be tuneable (as a LIMIT)
22# TODO: Add full CRUD support
23# TODO: It would be better for the default sep_char to be ',' rather than '!'
24# FIXME:        t/xml.t fails in slurping mode
25# TODO: Other databases e.g., Redis, noSQL, remote databases such as MySQL, PostgreSQL
26# TODO: The no_entry/entry terminology is confusing.  Replace with no_id/id_column
27# TODO: Add support for DBM::Deep
28# TODO: Log queries and the time that they took to execute per database
29
30
16
16
16
1995192
13
304
use warnings;
31
16
16
16
25
11
144
use strict;
32
33
16
16
16
2256
6796
24
use boolean;
34
16
16
16
481
15
312
use Carp;
35
16
16
16
24
12
224
use Data::Dumper;
36
16
16
16
3051
41549
365
use Data::Reuse;
37
16
16
16
3340
213699
1995
use DBD::SQLite::Constants qw/:file_open/;      # For SQLITE_OPEN_READONLY
38
16
16
16
50
14
1668
use Fcntl;      # For O_RDONLY
39
16
16
16
34
18
322
use File::Spec;
40
16
16
16
2820
3598
343
use File::pfopen 0.03;  # For $mode and list context
41
16
16
16
5514
105965
545
use File::Temp;
42
16
16
16
3357
566606
293
use Log::Abstraction 0.26;
43
16
16
16
3354
70545
261
use Object::Configure 0.16;
44
16
16
16
37
78
234
use Params::Get 0.13;
45# use Error::Simple;    # A nice idea to use this, but it doesn't play well with "use lib"
46
16
16
16
43
21
306
use Scalar::Util;
47
48our %defaults;
49
16
16
16
28
31
53759
use constant    DEFAULT_MAX_SLURP_SIZE => 16 * 1024; # CSV files <= than this size are read into memory
50
51 - 59
=head1 NAME

Database::Abstraction - Read-only Database Abstraction Layer (ORM)

=head1 VERSION

Version 0.33

=cut
60
61our $VERSION = '0.33';
62
63 - 193
=head1 DESCRIPTION

C<Database::Abstraction> is a read-only database abstraction layer (ORM) for Perl,
designed to provide a simple interface for accessing and querying various types of databases such as CSV, XML, and SQLite without the need to write SQL queries.
It promotes code maintainability by abstracting database access logic into a single interface,
allowing users to switch between different storage formats seamlessly.
The module supports caching for performance optimization,
flexible logging for debugging and monitoring,
and includes features like the AUTOLOAD method for convenient access to database columns.
By handling numerous database and file formats,
C<Database::Abstraction> adds versatility and simplifies the management of read-intensive applications.

=head1 SYNOPSIS

Abstract class giving read-only access to CSV,
XML,
BerkeleyDB and SQLite databases via Perl without writing any SQL,
using caching for performance optimization.

The module promotes code maintainability by abstracting database access logic into a single interface.
Users can switch between different storage formats without changing application logic.
The ability to handle numerous database and file formats adds versatility and makes it useful for a variety of applications.

It's a simple ORM like interface which,
for all of its simplicity,
allows you to do a lot of the heavy lifting of simple database operations without any SQL.
It offers functionalities like opening the database and fetching data based on various criteria.

Built-in support for flexible and configurable caching improves performance for read-intensive applications.

Supports logging to debug and monitor database operations.

Look for databases in $directory in this order:

=over 4

=item 1 C<SQLite>

File ends with .sql

=item 2 C<PSV>

Pipe separated file, file ends with .psv

=item 3 C<CSV>

File ends with .csv or .db, can be gzipped. Note the default sep_char is '!' not ','

=item 4 C<XML>

File ends with .xml

=item 5 C<BerkeleyDB>

File ends with .db

=back

The AUTOLOAD feature allows for convenient access to database columns using method calls.
It hides the complexity of querying the underlying data storage.

If the table has a key column,
entries are keyed on that and sorts are based on it.
To turn that off, pass 'no_entry' to the constructor, for legacy
reasons it's enabled by default.
The key column's default name is 'entry', but it can be overridden by the 'id' parameter.

Arrays are made read-only before being returned.
To disable that, pass C<no_fixate> to the constructor.

CSV files that are not no_entry can have empty lines or comment lines starting with '#',
to make them more readable.

=head1 EXAMPLE

If the file /var/dat/foo.csv contains something like:

    "customer_id","name"
    "plugh","John"
    "xyzzy","Jane"

Create a driver for the file in .../Database/foo.pm:

    package Database::foo;

    use Database::Abstraction;

    our @ISA = ('Database::Abstraction');

    # Regular CSV: There is no entry column and the separators are commas
    sub new
    {
        my $class = shift;
        my %args = (ref($_[0]) eq 'HASH') ? %{$_[0]} : @_;

        return $class->SUPER::new(no_entry => 1, sep_char => ',', %args);
    }

You can then use this code to access the data via the driver:

    # Opens the file, e.g. /var/dat/foo.csv
    my $foo = Database::foo->new(directory => '/var/dat');

    # Prints "John"
    print 'Customer name ', $foo->name(customer_id => 'plugh'), "\n";

    # Prints:
    #  $VAR1 = {
    #     'customer_id' => 'xyzzy',
    #     'name' => 'Jane'
    #  };
    my $row = $foo->fetchrow_hashref(customer_id => 'xyzzy');
    print Data::Dumper->new([$row])->Dump();

=head1 SUBROUTINES/METHODS

=head2 init

Initializes the abstraction class and its subclasses with optional arguments for configuration.

    Database::Abstraction::init(directory => '../data');

See the documentation for new to see what variables can be set.

Returns a reference to a hash of the current values.
Therefore when given with no arguments you can get the current default values:

    my $defaults = Database::Abstraction::init();
    print $defaults->{'directory'}, "\n";

=cut
194
195# Subroutine to initialize with args
196sub init
197{
198
46
468524
        if(my $params = Params::Get::get_params(undef, @_)) {
199
44
509
                if(($params->{'expires_in'} && !$params->{'cache_duration'})) {
200                        # Compatibility with CHI
201
1
1
                        $params->{'cache_duration'} = $params->{'expires_in'};
202                }
203
204
44
44
42
90
                %defaults = (%defaults, %{$params});
205
44
148
                $defaults{'cache_duration'} ||= '1 hour';
206        }
207
208
46
910
        return \%defaults
209}
210
211 - 221
=head2 import

The module can be initialised by the C<use> directive.

    use Database::Abstraction 'directory' => '/etc/data';

or

    use Database::Abstraction { 'directory' => '/etc/data' };

=cut
222
223sub import
224{
225
38
2024
        my $pkg = shift;
226
227
38
60
        if((scalar(@_) % 2) == 0) {
228
38
36
                my %h = @_;
229
38
62
                init(Object::Configure::configure($pkg, \%h));
230        } elsif((scalar(@_) == 1) && (ref($_[0]) eq 'HASH')) {
231
0
0
                init(Object::Configure::configure($pkg, $_[0]));
232        } elsif(scalar(@_) > 0) {    # >= 3 would also work here
233
0
0
                init(\@_);
234        }
235}
236
237 - 316
=head2 new

Create an object to point to a read-only database.

Arguments:

Takes different argument formats (hash or positional)

=over 4

=item * C<auto_load>

Enable/disable the AUTOLOAD feature.
The default is to have it enabled.

=item * C<cache>

Place to store results

=item * C<cache_duration>

How long to store results in the cache (default is 1 hour).

=item * C<config_file>

Points to a configuration file which contains the parameters to C<new()>.
The file can be in any common format including C<YAML>, C<XML>, and C<INI>.
This allows the parameters to be set at run time.

=item * C<expires_in>

Synonym of C<cache_duration>, for compatibility with C<CHI>.

=item * C<dbname>

The prefix of name of the database file (default is name of the table).
The database will be held in a file such as $dbname.csv.

=item * C<directory>

Where the database file is held.
If only one argument is given to C<new()>, it is taken to be C<directory>.

=item * C<filename>

Filename containing the data.
When not given,
the filename is derived from the tablename
which in turn comes from the class name.

=item * C<logger>

Takes an optional parameter logger, which is used for warnings and traces.
Can be an object that understands warn() and trace() messages,
such as a L<Log::Log4perl> or L<Log::Any> object,
a reference to code,
or a filename.

=item * C<max_slurp_size>

CSV/PSV/XML files smaller than this are held in a HASH in RAM (default is 16K),
falling back to SQL on larger data sets.
Setting this value to 0 will turn this feature off,
thus forcing SQL to be used to access the database

=back

If the arguments are not set, tries to take from class level defaults.

Checks for abstract class usage.

Slurp mode assumes that the key column (entry) is unique.
If it isn't, searches will be incomplete.
Turn off slurp mode on those databases,
by setting a low value for max_slurp_size.

Clones existing objects with or without modifications.
Uses Carp::carp to log warnings for incorrect usage or potential mistakes.

=cut
317
318sub new {
319
35
95589
        my $class = shift;
320
35
27
        my %args;
321
322        # Handle hash or hashref arguments
323
35
101
        if((scalar(@_) == 1) && !ref($_[0])) {
324
4
5
                $args{'directory'} = $_[0];
325        } elsif(my $params = Params::Get::get_params(undef, @_)) {
326
24
24
292
38
                %args = %{$params};
327        }
328
329
35
157
        if(!defined($class)) {
330
0
0
                if((scalar keys %args) > 0) {
331                        # Using Database::Abstraction->new(), not Database::Abstraction::new()
332
0
0
                        carp(__PACKAGE__, ' use ->new() not ::new() to instantiate');
333
0
0
                        return;
334                }
335                # FIXME: this only works when no arguments are given
336
0
0
                $class = __PACKAGE__;
337        } elsif($class eq __PACKAGE__) {
338
1
11
                croak("$class: abstract class");
339        } elsif(Scalar::Util::blessed($class)) {
340                # If $class is an object, clone it with new arguments
341
2
2
2
8
                return bless { %{$class}, %args }, ref($class);
342        }
343
344        # Load the configuration from a config file, if provided
345
32
32
26
44
        %args = %{Object::Configure::configure($class, \%args)};
346
347        # Validate logger object has required methods
348
32
47953
        if(defined $args{'logger'}) {
349
32
190
                unless(Scalar::Util::blessed($args{'logger'}) && $args{'logger'}->can('info') && $args{'logger'}->can('error')) {
350
0
0
                        Carp::croak("Logger must be an object with info() and error() methods");
351                }
352        }
353
354
32
70
        croak("$class: where are the files?") unless($args{'directory'} || $defaults{'directory'});
355
356
31
187
        croak("$class: ", $args{'directory'} || $defaults{'directory'}, ' is not a directory') unless(-d ($args{'directory'} || $defaults{'directory'}));
357
358        # init(\%args);
359
360        # return bless {
361                # logger => $args{'logger'} || $logger,
362                # directory => $args{'directory'} || $directory,     # The directory containing the tables in XML, SQLite or CSV format
363                # cache => $args{'cache'} || $cache,
364                # cache_duration => $args{'cache_duration'} || $cache_duration || '1 hour',
365                # table => $args{'table'},   # The name of the file containing the table, defaults to the class name
366                # no_entry => $args{'no_entry'} || 0,
367        # }, $class;
368
369        # Re-seen keys take precedence, so defaults come first
370        # print STDERR ">>>>>>>>>>>>>>>>>>>\n";
371        # print STDERR __LINE__, "\n";
372        # print STDERR $args{'id'} || 'undef';
373        # print STDERR "\n";
374
30
131
        return bless {
375                no_entry => 0,
376                no_fixate => 0,
377                id => 'entry',
378                cache_duration => '1 hour',
379                max_slurp_size => DEFAULT_MAX_SLURP_SIZE,
380                %defaults,
381                %args,
382        }, $class;
383}
384
385 - 389
=head2  set_logger

Sets the class, code reference, or file that will be used for logging.

=cut
390
391sub set_logger
392{
393
6
1815
        my $self = shift;
394
6
10
        my $params = Params::Get::get_params('logger', @_);
395
396
5
55
        if(my $logger = $params->{'logger'}) {
397
5
8
                if(Scalar::Util::blessed($logger)) {
398
2
6
                        $self->{'logger'} = $logger;
399                } else {
400
3
5
                        $self->{'logger'} = Log::Abstraction->new($logger);
401                }
402
5
50
                return $self;
403        }
404
0
0
        Carp::croak('Usage: set_logger(logger => $logger)')
405}
406
407# Open the database connection based on the specified type (e.g., SQLite, CSV).
408# Read the data into memory or establish a connection to the database file.
409# column_names allows the column names to be overridden on CSV files
410
411sub _open
412{
413
17
46
        if(!UNIVERSAL::isa((caller)[0], __PACKAGE__)) {
414
0
0
                Carp::croak('Illegal Operation: This method can only be called by a subclass');
415        }
416
417
17
170
        my $self = shift;
418
17
28
        my $params = Params::Get::get_params(undef, @_);
419
420
17
203
        $params->{'sep_char'} ||= $self->{'sep_char'} ? $self->{'sep_char'} : '!';
421
17
37
        my $max_slurp_size = $params->{'max_slurp_size'} || $self->{'max_slurp_size'};
422
423
17
42
        my $table = $self->{'table'} || ref($self);
424
17
33
        $table =~ s/.*:://;
425
426
17
47
        $self->_trace(ref($self), ": _open $table");
427
428
17
669
        return if($self->{$table});
429
430        # Read in the database
431
17
15
        my $dbh;
432
433
17
46
        my $dir = $self->{'directory'} || $defaults{'directory'};
434
17
57
        my $dbname = $self->{'dbname'} || $defaults{'dbname'} || $table;
435
17
103
        my $slurp_file = File::Spec->catfile($dir, "$dbname.sql");
436
437
17
68
        $self->_debug("_open: try to open $slurp_file");
438
439        # Look at various places to find the file and derive the file type from the file's name
440
17
621
        if(-r $slurp_file) {
441                # SQLite file
442
0
0
                require DBI && DBI->import() unless DBI->can('connect');
443
444
0
0
                $dbh = DBI->connect("dbi:SQLite:dbname=$slurp_file", undef, undef, {
445                        sqlite_open_flags => SQLITE_OPEN_READONLY,
446                });
447        }
448
17
111
        if($dbh) {
449
0
0
                $dbh->do('PRAGMA synchronous = OFF');
450
0
0
                $dbh->do('PRAGMA cache_size = -4096');       # Use 4MB cache - negative = KB)
451
0
0
                $dbh->do('PRAGMA journal_mode = OFF');       # Read-only, no journal needed
452
0
0
                $dbh->do('PRAGMA temp_store = MEMORY');      # Store temp data in RAM
453
0
0
                $dbh->do('PRAGMA mmap_size = 1048576');      # Use 1MB memory-mapped I/O
454
0
0
                $dbh->sqlite_busy_timeout(100000);   # 10s
455
0
0
                $self->_debug("read in $table from SQLite $slurp_file");
456
0
0
                $self->{'type'} = 'DBI';
457        } elsif($self->_is_berkeley_db(File::Spec->catfile($dir, "$dbname.db"))) {
458
0
0
                $self->_debug("$table is a BerkeleyDB file");
459
0
0
                $self->{'type'} = 'BerkeleyDB';
460        } else {
461
17
10
                my $fin;
462
17
38
                ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv.gz:db.gz', '<');
463
17
486
                if(defined($slurp_file) && (-r $slurp_file)) {
464
0
0
                        require Gzip::Faster;
465
0
0
                        Gzip::Faster->import();
466
467
0
0
                        close($fin);
468
0
0
                        $fin = File::Temp->new(SUFFIX => '.csv', UNLINK => 0);
469
0
0
                        print $fin gunzip_file($slurp_file);
470
0
0
                        $slurp_file = $fin->filename();
471
0
0
                        $self->{'temp'} = $slurp_file;
472                } else {
473
17
22
                        ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'psv', '<');
474
17
298
                        if(defined($fin)) {
475                                # Pipe separated file
476
4
35
                                $params->{'sep_char'} = '|';
477                        } else {
478                                # CSV file
479
13
23
                                ($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv:db', '<');
480                        }
481                }
482
17
325
                if(my $filename = $self->{'filename'} || $defaults{'filename'}) {
483
1
2
                        $self->_debug("Looking for $filename in $dir");
484
1
14
                        $slurp_file = File::Spec->catfile($dir, $filename);
485                }
486
17
85
                if(defined($slurp_file) && (-r $slurp_file)) {
487
14
57
                        close($fin) if(defined($fin));
488
14
14
                        my $sep_char = $params->{'sep_char'};
489
490
14
37
                        $self->_debug(__LINE__, ' of ', __PACKAGE__, ": slurp_file = $slurp_file, sep_char = $sep_char");
491
492
14
560
                        if($params->{'column_names'}) {
493                                $dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef,
494                                        {
495                                                csv_sep_char => $sep_char,
496                                                csv_tables => {
497                                                        $table => {
498
1
4
                                                                col_names => $params->{'column_names'},
499                                                        },
500                                                },
501                                        }
502                                );
503                        } else {
504
13
73
                                $dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef, { csv_sep_char => $sep_char});
505                        }
506
14
522669
                        $dbh->{'RaiseError'} = 1;
507
508
14
84
                        $self->_debug("read in $table from CSV $slurp_file");
509
510
14
669
                        $dbh->{csv_tables}->{$table} = {
511                                allow_loose_quotes => 1,
512                                blank_is_undef => 1,
513                                empty_is_undef => 1,
514                                binary => 1,
515                                f_file => $slurp_file,
516                                escape_char => '\\',
517                                sep_char => $sep_char,
518                                # Don't do this, causes "Bizarre copy of HASH
519                                #       in scalar assignment in error_diag
520                                #       RT121127
521                                # auto_diag => 1,
522                                auto_diag => 0,
523                                # Don't do this, it causes "Attempt to free unreferenced scalar"
524                                # callbacks => {
525                                        # after_parse => sub {
526                                                # my ($csv, @rows) = @_;
527                                                # my @rc;
528                                                # foreach my $row(@rows) {
529                                                        # if($row->[0] !~ /^#/) {
530                                                                # push @rc, $row;
531                                                        # }
532                                                # }
533                                                # return @rc;
534                                        # }
535                                # }
536                        };
537
538                        # my %options = (
539                                # allow_loose_quotes => 1,
540                                # blank_is_undef => 1,
541                                # empty_is_undef => 1,
542                                # binary => 1,
543                                # f_file => $slurp_file,
544                                # escape_char => '\\',
545                                # sep_char => $sep_char,
546                        # );
547
548                        # $dbh->{csv_tables}->{$table} = \%options;
549                        # delete $options{f_file};
550
551                        # require Text::CSV::Slurp;
552                        # Text::CSV::Slurp->import();
553                        # $self->{'data'} = Text::CSV::Slurp->load(file => $slurp_file, %options);
554
555                        # Can't slurp when we want to use our own column names as Text::xSV::Slurp has no way to override the names
556                        # FIXME: Text::xSV::Slurp can't cope well with quotes in field contents
557
14
11177
                        if(((-s $slurp_file) <= $max_slurp_size) && !$params->{'column_names'}) {
558
12
53
                                if((-s $slurp_file) == 0) {
559                                        # Empty file
560
0
0
                                        $self->{'data'} = ();
561                                } else {
562
12
1553
                                        require Text::xSV::Slurp;
563
12
24590
                                        Text::xSV::Slurp->import();
564
565
12
35
                                        $self->_debug('slurp in');
566
567
12
623
                                        my $dataref = xsv_slurp(
568                                                shape => 'aoh',
569                                                text_csv => {
570                                                        sep_char => $sep_char,
571                                                        allow_loose_quotes => 1,
572                                                        blank_is_undef => 1,
573                                                        empty_is_undef => 1,
574                                                        binary => 1,
575                                                        escape_char => '\\',
576                                                },
577                                                # string => \join('', grep(!/^\s*(#|$)/, <DATA>))
578                                                file => $slurp_file
579                                        );
580
581                                        # Ignore blank lines or lines starting with # in the CSV file
582
12
39
60
12
2148
46
61
11
                                        my @data = grep { $_->{$self->{'id'}} !~ /^\s*#/ } grep { defined($_->{$self->{'id'}}) } @{$dataref};
583
584
12
21
                                        if($self->{'no_entry'}) {
585                                                # Not keyed, will need to scan each entry
586
3
8
                                                $self->{'data'} = @data;
587                                        } else {
588                                                # keyed on the $self->{'id'} (default: "entry") column
589                                                # while(my $d = shift @data) {
590                                                        # $self->{'data'}->{$d->{$self->{'id'}}} = $d;
591                                                # }
592                                                # Build hash directly from the filtered array, better to use map to avoid data copy
593                                                # and enclose in { } to ensure it's a hash ref
594
9
34
7
48
                                                $self->{'data'} = { map { $_->{$self->{'id'}} => $_ } @data };
595                                        }
596                                }
597                        }
598
14
32
                        $self->{'type'} = 'CSV';
599                } else {
600
3
8
                        $slurp_file = File::Spec->catfile($dir, "$dbname.xml");
601
3
15
                        if(-r $slurp_file) {
602
2
7
                                if((-s $slurp_file) <= $max_slurp_size) {
603
1
371
                                        require XML::Simple;
604
1
3689
                                        XML::Simple->import();
605
606
1
26
                                        my $xml = XMLin($slurp_file);
607
1
1
18279
2
                                        my @keys = keys %{$xml};
608
1
1
                                        my $key = $keys[0];
609
1
1
                                        my @data;
610
1
2
                                        if(ref($xml->{$key}) eq 'ARRAY') {
611
1
1
11
1
                                                @data = @{$xml->{$key}};
612                                        } elsif(ref($xml) eq 'ARRAY') {
613
0
0
0
0
                                                @data = @{$xml};
614                                        } elsif((ref($xml) eq 'HASH') && !$self->{'no_entry'}) {
615
0
0
0
0
                                                if(scalar(keys %{$xml}) == 1) {
616
0
0
                                                        if($xml->{$table}) {
617
0
0
                                                                @data = $xml->{$table};
618                                                        } else {
619
0
0
                                                                die 'TODO: import arbitrary XML with "entry" field';
620                                                        }
621                                                } else {
622
0
0
                                                        die 'TODO: import arbitrary XML (differnt number of keys)';
623                                                }
624                                        } else {
625
0
0
                                                die 'TODO: import arbitrary XML, cannot currently handle ', ref($xml);
626                                        }
627
1
1
                                        $self->{'data'} = ();
628
1
2
                                        if($self->{'no_entry'}) {
629                                                # Not keyed, will need to scan each entry
630
0
0
                                                my $i = 0;
631
0
0
                                                foreach my $d(@data) {
632
0
0
                                                        $self->{'data'}->{$i++} = $d;
633                                                }
634                                        } else {
635                                                # keyed on the $self->{'id'} (default: "entry") column
636
1
1
                                                foreach my $d(@data) {
637
5
7
                                                        $self->{'data'}->{$d->{$self->{'id'}}} = $d;
638                                                }
639                                        }
640                                } else {
641
1
3
                                        $dbh = DBI->connect('dbi:XMLSimple(RaiseError=>1):');
642
1
86742
                                        $dbh->{'RaiseError'} = 1;
643
1
10
                                        $self->_debug("read in $table from XML $slurp_file");
644
1
17
                                        $dbh->func($table, 'XML', $slurp_file, 'xmlsimple_import');
645                                }
646                        } else {
647                                # throw Error(-file => "$dir/$table");
648
1
3
                                $self->_fatal("Can't find a file called '$dbname' for the table $table in $dir");
649                        }
650
2
7
                        $self->{'type'} = 'XML';
651                }
652        }
653
654
16
0
61
0
        Data::Reuse::fixate(%{$self->{'data'}}) if($self->{'data'} && (ref($self->{'data'} eq 'HASH')));
655
656
16
31
        $self->{$table} = $dbh;
657
16
95
        my @statb = stat($slurp_file);
658
16
27
        $self->{'_updated'} = $statb[9];
659
660
16
27
        return $self;
661}
662
663 - 675
=head2  selectall_arrayref

Returns a reference to an array of hash references of all the data meeting
the given criteria.

Note that since this returns an array ref,
optimisations such as "LIMIT 1" will not be used.

Use caching if that is available.

Returns undef if there are no matches.

=cut
676
677sub selectall_arrayref {
678
8
667
        my $self = shift;
679
8
8
        my $params;
680
681
8
19
        if($self->{'no_entry'}) {
682
1
3
                $params = Params::Get::get_params(undef, \@_);
683        } elsif(scalar(@_)) {
684
4
7
                $params = Params::Get::get_params('entry', @_);
685        }
686
687
8
67
        if($self->{'berkeley'}) {
688
0
0
                $self->_fatal(ref($self), ': selectall_arrayref is meaningless on a NoSQL database');
689        }
690
691
8
14
        my $table = $self->_open_table($params);
692
693
8
13
        if($self->{'data'}) {
694
4
4
3
4
                if(scalar(keys %{$params}) == 0) {
695
2
5
                        $self->_trace("$table: selectall_arrayref fast track return");
696
2
31
                        if(ref($self->{'data'}) eq 'HASH') {
697                                # $self->{'data'} looks like this:
698                                #       key1 => {
699                                #               entry => key1,
700                                #               field1 => value1,
701                                #               field2 => value2
702                                #       }, key2 => {
703                                #               entry => key2,
704                                #               field1 => valuea,
705                                #               field2 => valueb
706                                #       }
707
2
2
3
4
                                $self->_debug("$table: returning ", scalar keys %{$self->{'data'}}, ' entries');
708
2
2
22
3
                                if(scalar keys %{$self->{'data'}} <= 10) {
709
2
5
                                        $self->_debug(Dumper($self->{'data'}));
710                                }
711
2
22
                                my @rc;
712
2
2
2
2
                                foreach my $k (keys %{$self->{'data'}}) {
713
9
9
                                        push @rc, $self->{'data'}->{$k};
714                                }
715
2
7
                                return Return::Set::set_return(\@rc, { type => 'arrayref' });
716                        }
717
0
0
                        return Return::Set::set_return($self->{'data'}, { type => 'arrayref'});
718                        # my @rc = values %{$self->{'data'}};
719                        # return @rc;
720
2
6
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
721
1
2
                        return Return::Set::set_return([$self->{'data'}->{$params->{'entry'}}], { type => 'arrayref' });
722                }
723        }
724
725
5
4
        my $query;
726
5
5
        my $done_where = 0;
727
728
5
15
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
729
3
2
                $query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
730
3
3
                $done_where = 1;
731        } else {
732
2
4
                $query = "SELECT * FROM $table";
733        }
734
735
5
4
        my @query_args;
736
5
5
4
10
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
737
3
3
                my $arg = $params->{$c1};
738
3
8
                if(ref($arg)) {
739                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
740
0
0
                        $self->_fatal("selectall_arrayref(): $query: argument is not a string: ", ref($arg));
741                }
742
3
4
                if(!defined($arg)) {
743
0
0
                        my @call_details = caller(0);
744                        # throw Error::Simple("$query: value for $c1 is not defined in call from " .
745                                # $call_details[2] . ' of ' . $call_details[1]);
746
0
0
                        Carp::croak("$query: value for $c1 is not defined in call from ",
747                                $call_details[2], ' of ', $call_details[1]);
748                }
749
750
3
3
                my $keyword;
751
3
4
                if($done_where) {
752
3
4
                        $keyword = 'AND';
753                } else {
754
0
0
                        $keyword = 'WHERE';
755
0
0
                        $done_where = 1;
756                }
757
3
6
                if($arg =~ /[%_]/) {
758
0
0
                        $query .= " $keyword $c1 LIKE ?";
759                } else {
760
3
5
                        $query .= " $keyword $c1 = ?";
761                }
762
3
3
                push @query_args, $arg;
763        }
764
5
10
        if(!$self->{no_entry}) {
765
4
6
                $query .= ' ORDER BY ' . $self->{'id'};
766        }
767
768
5
7
        if(defined($query_args[0])) {
769
3
8
                $self->_debug("selectall_arrayref $query: ", join(', ', @query_args));
770        } else {
771
2
3
                $self->_debug("selectall_arrayref $query");
772        }
773
774
5
101
        my $key;
775        my $c;
776
5
9
        if($c = $self->{cache}) {
777
0
0
                $key = ref($self) . "::$query array";
778
0
0
                if(defined($query_args[0])) {
779
0
0
                        $key .= ' ' . join(', ', @query_args);
780                }
781
0
0
                $self->_debug("cache key = '$key'");
782
0
0
                if(my $rc = $c->get($key)) {
783
0
0
                        $self->_debug('cache HIT');
784
0
0
                        return $rc;     # We stored a ref to the array
785
786                        # This use of a temporary variable is to avoid
787                        #       "Implicit scalar context for array in return"
788                        # my @rc = @{$rc};
789                        # return @rc;
790                }
791
0
0
                $self->_debug('cache MISS');
792        } else {
793
5
8
                $self->_debug('cache not used');
794        }
795
796
5
91
        if(my $sth = $self->{$table}->prepare($query)) {
797
5
5808
                $sth->execute(@query_args) ||
798                        # throw Error::Simple("$query: @query_args");
799                        croak("$query: @query_args");
800
801
5
8785
                my $rc;
802
5
29
                while(my $href = $sth->fetchrow_hashref()) {
803
27
27
27
680
50
26
                        push @{$rc}, $href if(scalar keys %{$href});
804                }
805
5
98
                if($c) {
806
0
0
                        $c->set($key, $rc, $self->{'cache_duration'});    # Store a ref to the array
807                }
808
809
5
5
9
9
                Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
810
811
5
774
                return $rc;
812        }
813
0
0
        $self->_warn("selectall_array failure on $query: @query_args");
814        # throw Error::Simple("$query: @query_args");
815
0
0
        croak("$query: @query_args");
816
817        # my @rc = grep { defined $_ } $self->selectall_array(@_);
818
819        # return if(scalar(@rc) == 0);
820
821        # Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
822        # return \@rc;
823}
824
825 - 829
=head2  selectall_hashref

Deprecated misleading legacy name for selectall_arrayref.

=cut
830
831sub selectall_hashref
832{
833
6
1193
        my $self = shift;
834
6
14
        return $self->selectall_arrayref(@_);
835}
836
837 - 846
=head2  selectall_array

Similar to selectall_array but returns an array of hash references.

Con:    Copies more data around than selectall_arrayref
Pro:    Better determination of list vs scalar mode than selectall_arrayref by setting "LIMIT 1"

TODO:   Remove duplicated code

=cut
847
848sub selectall_array
849{
850
7
6
        my $self = shift;
851
852
7
30
        if($self->{'berkeley'}) {
853
0
0
                Carp::croak(ref($self), ': selectall_array is meaningless on a NoSQL database');
854        }
855
856
7
17
        my $params = Params::Get::get_params(undef, \@_);
857
7
107
        my $table = $self->_open_table($params);
858
859
7
10
        if($self->{'data'}) {
860
4
4
3
8
                if(scalar(keys %{$params}) == 0) {
861
3
7
                        $self->_trace("$table: selectall_array fast track return");
862
3
49
                        if(ref($self->{'data'}) eq 'HASH') {
863
3
3
2
9
                                return values %{$self->{'data'}};
864                        }
865
0
0
0
0
                        return @{$self->{'data'}};
866                        # my @rc = values %{$self->{'data'}};
867                        # return @rc;
868
1
3
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
869
0
0
                        return $self->{'data'}->{$params->{'entry'}};
870                }
871        }
872
873
4
4
        my $query;
874
4
4
        my $done_where = 0;
875
876
4
12
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
877
2
2
                $query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
878
2
2
                $done_where = 1;
879        } else {
880
2
3
                $query = "SELECT * FROM $table";
881        }
882
883
4
3
        my @query_args;
884
4
4
5
8
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
885
3
3
                my $arg = $params->{$c1};
886
3
4
                if(ref($arg)) {
887                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
888
0
0
                        $self->_fatal("selectall_array(): $query: argument is not a string: ", ref($arg));
889                }
890
891
3
3
                my $keyword;
892
3
4
                if($done_where) {
893
2
2
                        $keyword = 'AND';
894                } else {
895
1
1
                        $keyword = 'WHERE';
896
1
1
                        $done_where = 1;
897                }
898
3
4
                if(!defined($arg)) {
899
1
13
                        $query .= " $keyword $c1 IS NULL"
900                } else {
901
2
4
                        if($arg =~ /[%_]/) {
902
0
0
                                $query .= " $keyword $c1 LIKE ?";
903                        } else {
904
2
4
                                $query .= " $keyword $c1 = ?";
905                        }
906
2
4
                        push @query_args, $arg;
907                }
908        }
909
4
6
        if(!$self->{no_entry}) {
910
3
4
                $query .= ' ORDER BY ' . $self->{'id'};
911        }
912
4
7
        if(!wantarray) {
913
0
0
                $query .= ' LIMIT 1';
914        }
915
916
4
5
        if(defined($query_args[0])) {
917
2
5
                $self->_debug("selectall_array $query: ", join(', ', @query_args));
918        } else {
919
2
4
                $self->_debug("selectall_array $query");
920        }
921
922
4
73
        my $key;
923        my $c;
924
4
7
        if($c = $self->{cache}) {
925
0
0
                $key = ref($self) . '::' . $query;
926
0
0
                if(wantarray) {
927
0
0
                        $key .= ' array';
928                }
929
0
0
                if(defined($query_args[0])) {
930
0
0
                        $key .= ' ' . join(', ', @query_args);
931                }
932
0
0
                $self->_debug("cache key = '$key'");
933
0
0
                if(my $rc = $c->get($key)) {
934
0
0
                        $self->_debug('cache HIT');
935
0
0
0
0
                        return wantarray ? @{$rc} : $rc;        # We stored a ref to the array
936
937                        # This use of a temporary variable is to avoid
938                        #       "Implicit scalar context for array in return"
939                        # my @rc = @{$rc};
940                        # return @rc;
941                }
942
0
0
                $self->_debug('cache MISS');
943        } else {
944
4
5
                $self->_debug('cache not used');
945        }
946
947
4
67
        if(my $sth = $self->{$table}->prepare($query)) {
948
4
4362
                $sth->execute(@query_args) ||
949                        # throw Error::Simple("$query: @query_args");
950                        croak("$query: @query_args");
951
952
4
7158
                my $rc;
953
4
19
                while(my $href = $sth->fetchrow_hashref()) {
954
22
526
                        return $href if(!wantarray);    # FIXME: Doesn't store in the cache
955
22
22
16
42
                        push @{$rc}, $href;
956                }
957
4
77
                if($c) {
958
0
0
                        $c->set($key, $rc, $self->{'cache_duration'});    # Store a ref to the array
959                }
960
961
4
7
                if($rc) {
962
4
4
6
9
                        Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
963
4
4
714
35
                        return @{$rc};
964                }
965
0
0
                return;
966        }
967
0
0
        $self->_warn("selectall_array failure on $query: @query_args");
968        # throw Error::Simple("$query: @query_args");
969
0
0
        croak("$query: @query_args");
970}
971
972 - 976
=head2  selectall_hash

Deprecated misleading legacy name for selectall_array.

=cut
977
978sub selectall_hash
979{
980
7
1812
        my $self = shift;
981
7
18
        return $self->selectall_array(@_);
982}
983
984 - 988
=head2  count

Return the number items/rows matching the given criteria

=cut
989
990sub count
991{
992
5
727
        my $self = shift;
993
994
5
8
        if($self->{'berkeley'}) {
995
0
0
                Carp::croak(ref($self), ': count is meaningless on a NoSQL database');
996        }
997
998
5
15
        my $params = Params::Get::get_params(undef, \@_);
999
5
53
        my $table = $self->_open_table($params);
1000
1001
5
9
        if($self->{'data'}) {
1002
3
3
20
7
                if(scalar(keys %{$params}) == 0) {
1003
2
4
                        $self->_trace("$table: count fast track return");
1004
2
31
                        if(ref($self->{'data'}) eq 'HASH') {
1005
2
2
1
5
                                return scalar keys %{$self->{'data'}};
1006                        }
1007
0
0
0
0
                        return scalar @{$self->{'data'}};
1008
1
2
                } elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
1009
0
0
                        return $self->{'data'}->{$params->{'entry'}} ? 1 : 0;
1010                }
1011        }
1012
1013
3
2
        my $query;
1014
3
3
        my $done_where = 0;
1015
1016
3
17
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1017
1
1
                $query = "SELECT COUNT(*) FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
1018
1
1
                $done_where = 1;
1019        } elsif($self->{no_entry}) {
1020
2
2
                $query = "SELECT COUNT(*) FROM $table";
1021        } else {
1022
0
0
                $query = "SELECT COUNT(entry) FROM $table";
1023        }
1024
1025
3
1
        my @query_args;
1026
3
3
3
5
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1027
1
1
                my $arg = $params->{$c1};
1028
1
1
                if(ref($arg)) {
1029                        # throw Error::Simple("$query: argument is not a string: " . ref($arg));
1030
0
0
                        $self->_fatal("count(): $query: argument is not a string: ", ref($arg));
1031                }
1032
1
1
                if(!defined($arg)) {
1033
0
0
                        my @call_details = caller(0);
1034                        # throw Error::Simple("$query: value for $c1 is not defined in call from " .
1035                                # $call_details[2] . ' of ' . $call_details[1]);
1036
0
0
                        Carp::croak("$query: value for $c1 is not defined in call from ",
1037                                $call_details[2], ' of ', $call_details[1]);
1038                }
1039
1040
1
1
                my $keyword;
1041
1
1
                if($done_where) {
1042
1
1
                        $keyword = 'AND';
1043                } else {
1044
0
0
                        $keyword = 'WHERE';
1045
0
0
                        $done_where = 1;
1046                }
1047
1
1
                if($arg =~ /[%_]/) {
1048
0
0
                        $query .= " $keyword $c1 LIKE ?";
1049                } else {
1050
1
2
                        $query .= " $keyword $c1 = ?";
1051                }
1052
1
1
                push @query_args, $arg;
1053        }
1054
3
5
        if(!$self->{no_entry}) {
1055
1
1
                $query .= ' ORDER BY ' . $self->{'id'};
1056        }
1057
1058
3
4
        if(defined($query_args[0])) {
1059
1
2
                $self->_debug("count $query: ", join(', ', @query_args));
1060        } else {
1061
2
4
                $self->_debug("count $query");
1062        }
1063
1064
3
40
        my $key;
1065        my $c;
1066
3
20
        if($c = $self->{cache}) {
1067
0
0
                $key = ref($self) . '::' . $query;
1068
0
0
                $key =~ s/COUNT\((.+?)\)/$1/;
1069
0
0
                $key .= ' array';
1070
0
0
                if(defined($query_args[0])) {
1071
0
0
                        $key .= ' ' . join(', ', @query_args);
1072                }
1073
0
0
                if(my $rc = $c->get($key)) {
1074                        # Unlikely
1075
0
0
                        $self->_debug('cache HIT');
1076
0
0
0
0
                        return scalar @{$rc};   # We stored a ref to the array
1077                }
1078
0
0
                $self->_debug('cache MISS');
1079        } else {
1080
3
5
                $self->_debug('cache not used');
1081        }
1082
1083
3
86
        if(my $sth = $self->{$table}->prepare($query)) {
1084
3
1686
                $sth->execute(@query_args) ||
1085                        # throw Error::Simple("$query: @query_args");
1086                        croak("$query: @query_args");
1087
1088
3
3729
                my $count = $sth->fetchrow_arrayref()->[0];
1089
1090
3
54
                $sth->finish();
1091
1092
3
14
                return $count;
1093        }
1094
0
0
        $self->_warn("count failure on $query: @query_args");
1095        # throw Error::Simple("$query: @query_args");
1096
0
0
        croak("$query: @query_args");
1097}
1098
1099 - 1112
=head2  fetchrow_hashref

Returns a hash reference for a single row in a table.

It searches for the given arguments, searching IS NULL if the value is C<undef>

   my $res = $foo->fetchrow_hashref(entry => 'one');

Special argument: table: determines the table to read from if not the default,
which is worked out from the class name

When no_entry is not set allow just one argument to be given: the entry value.

=cut
1113
1114sub fetchrow_hashref {
1115
8
1195
        my $self = shift;
1116
1117
8
24
        $self->_trace('Entering fetchrow_hashref');
1118
1119
8
119
        my $params;
1120
1121
8
13
        if(!$self->{'no_entry'}) {
1122
7
26
                $params = Params::Get::get_params('entry', @_);
1123        } else {
1124
1
4
                $params = Params::Get::get_params(undef, @_);
1125        }
1126
1127
8
105
        my $table = $self->_open_table($params);
1128
1129        # ::diag($self->{'type'});
1130
8
7
18
19
        if($self->{'data'} && (!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1131
4
5
                $self->_debug('Fast return from slurped data');
1132
4
53
                return $self->{'data'}->{$params->{'entry'}};
1133        }
1134
1135
4
8
        if($self->{'berkeley'}) {
1136                # print STDERR ">>>>>>>>>>>>\n";
1137                # ::diag(Data::Dumper->new([$self->{'berkeley'}])->Dump());
1138
0
0
0
0
                if((!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1139
0
0
                        return { entry => $self->{'berkeley'}->{$params->{'entry'}} };
1140                }
1141
0
0
                my $id = $self->{'id'};
1142
0
0
0
0
                if($self->{'no_entry'} && (scalar keys(%{$params}) == 1) && defined($id) && defined($params->{$id})) {
1143
0
0
                        if(my $rc = $self->{'berkeley'}->{$params->{$id}}) {
1144
0
0
                                return { $params->{$id} => $rc }  # Return key->value as a hash pair
1145                        }
1146
0
0
                        return;
1147                }
1148
0
0
                Carp::croak(ref($self), ': fetchrow_hashref is meaningless on a NoSQL database');
1149        }
1150
1151
4
5
        my $query = 'SELECT * FROM ';
1152
4
5
        if(my $t = delete $params->{'table'}) {
1153
0
0
                $query .= $t;
1154        } else {
1155
4
5
                $query .= $table;
1156        }
1157
4
5
        my $done_where = 0;
1158
1159
4
11
        if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1160
3
3
                $query .= ' WHERE ' . $self->{'id'} . ' IS NOT NULL AND ' . $self->{'id'} . " NOT LIKE '#%'";
1161
3
3
                $done_where = 1;
1162        }
1163
4
3
        my @query_args;
1164
4
4
2
9
        foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1165
4
2
                my $keyword;
1166
1167
4
4
                if($done_where) {
1168
3
3
                        $keyword = 'AND';
1169                } else {
1170
1
1
                        $keyword = 'WHERE';
1171
1
1
                        $done_where = 1;
1172                }
1173
4
6
                if(my $arg = $params->{$c1}) {
1174
3
4
                        if(ref($arg)) {
1175                                # throw Error::Simple("$query: argument is not a string: " . ref($arg));
1176
0
0
                                $self->_fatal("fetchrow_hash(): $query: argument is not a string: ", ref($arg));
1177                        }
1178
1179
3
6
                        if($arg =~ /[%_]/) {
1180
0
0
                                $query .= " $keyword $c1 LIKE ?";
1181                        } else {
1182
3
4
                                $query .= " $keyword $c1 = ?";
1183                        }
1184
3
6
                        push @query_args, $arg;
1185                } else {
1186
1
1
                        $query .= " $keyword $c1 IS NULL";
1187                        # my @call_details = caller(0);
1188                        # # throw Error::Simple("$query: value for $c1 is not defined in call from " .
1189                                # # $call_details[2] . ' of ' . $call_details[1]);
1190                        # Carp::croak("$query: value for $c1 is not defined in call from ",
1191                                # $call_details[2], ' of ', $call_details[1]);
1192                }
1193        }
1194        # $query .= ' ORDER BY entry LIMIT 1';
1195
4
5
        $query .= ' LIMIT 1';
1196
4
5
        if(defined($query_args[0])) {
1197
3
5
                my @call_details = caller(0);
1198
3
35
                $self->_debug("fetchrow_hashref $query: ", join(', ', @query_args),
1199                        ' called from ', $call_details[2], ' of ', $call_details[1]);
1200        } else {
1201
1
1
                $self->_debug("fetchrow_hashref $query");
1202        }
1203
4
59
        my $key = ref($self) . '::';
1204
4
5
        if(defined($query_args[0])) {
1205
3
7
                if(wantarray) {
1206
0
0
                        $key .= 'array ';
1207                }
1208
3
8
                $key .= "fetchrow $query " . join(', ', @query_args);
1209        } else {
1210
1
1
                $key .= "fetchrow $query";
1211        }
1212
4
4
        my $c;
1213
4
6
        if($c = $self->{cache}) {
1214
0
0
                if(my $rc = $c->get($key)) {
1215
0
0
                        if(wantarray) {
1216
0
0
                                if(ref($rc) eq 'ARRAY') {
1217
0
0
0
0
                                        return @{$rc};  # We stored a ref to the array
1218                                }
1219                        } else {
1220
0
0
                                return $rc;
1221                        }
1222                }
1223        }
1224
1225
4
20
        my $sth = $self->{$table}->prepare($query) or die $self->{$table}->errstr();
1226        # $sth->execute(@query_args) || throw Error::Simple("$query: @query_args");
1227
4
10417
        $sth->execute(@query_args) || croak("$query: @query_args");
1228
4
4428
        my $rc = $sth->fetchrow_hashref();
1229
4
154
        if($c) {
1230
0
0
                if($rc) {
1231
0
0
                        $self->_debug("stash $key=>$rc in the cache for ", $self->{'cache_duration'});
1232
0
0
                        $self->_debug("returns ", Data::Dumper->new([$rc])->Dump());
1233                } else {
1234
0
0
                        $self->_debug("Stash $key=>undef in the cache for ", $self->{'cache_duration'});
1235                }
1236
0
0
                $c->set($key, $rc, $self->{'cache_duration'});
1237        }
1238
4
13
        return $rc;
1239}
1240
1241 - 1256
=head2  execute

Execute the given SQL query on the database.
In an array context, returns an array of hash refs,
in a scalar context returns a hash of the first row

On CSV tables without no_entry, it may help to add
"WHERE entry IS NOT NULL AND entry NOT LIKE '#%'"
to the query.

If the data have been slurped,
this will still work by accessing that actual database.

If "args" is given, it's an array of the arguments (see C<execute()> in L<DBI>).

=cut
1257
1258sub execute
1259{
1260
3
1284
        my $self = shift;
1261
1262
3
7
        if($self->{'berkeley'}) {
1263
0
0
                Carp::croak(ref($self), ': execute is meaningless on a NoSQL database');
1264        }
1265
1266
3
6
        my $args = Params::Get::get_params('query', @_);
1267
1268        # Ensure the 'query' parameter is provided
1269        Carp::croak(__PACKAGE__, ': Usage: execute(query => $query)')
1270
3
43
                unless defined $args->{'query'};
1271
1272
3
5
        my $table = $self->_open_table($args);
1273
1274
3
4
        my $query = $args->{'query'};
1275
1276        # Append "FROM <table>" if missing
1277
3
8
        $query .= " FROM $table" unless $query =~ /\sFROM\s/i;
1278
1279        # Log the query if a logger is available
1280
3
7
        $self->_debug("execute $query");
1281
1282        # Prepare and execute the query
1283
3
64
        my $sth = $self->{$table}->prepare($query);
1284
3
1778
        if(exists($args->{args})) {
1285
1
3
                $sth->execute($args->{args}) or croak("$query: ", join(', ', $args->{args}));        # Die with the query in case of error
1286        } else {
1287
2
5
                $sth->execute() or croak($query);    # Die with the query in case of error
1288        }
1289
1290        # Fetch the results
1291
3
2837
        my @results;
1292
3
11
        while (my $row = $sth->fetchrow_hashref()) {
1293                # Return a single hashref if scalar context is expected
1294
9
231
                return $row unless wantarray;
1295
9
17
                push @results, $row;
1296        }
1297
1298        # Return all rows as an array in list context
1299
3
63
        return @results;
1300}
1301
1302 - 1306
=head2 updated

Returns the timestamp of the last database update.

=cut
1307
1308sub updated {
1309
1
390
        my $self = shift;
1310
1311
1
2
        return $self->{'_updated'};
1312}
1313
1314 - 1335
=head2 AUTOLOAD

Directly access a database column.

Returns all entries in a column, a single entry based on criteria.
Uses cached data if available.

Returns an array of the matches,
or only the first when called in scalar context

If the database has a column called "entry" you can do a quick lookup with

    my $value = $foo->column('123'); # where "column" is the value you're after

    my @entries = $foo->entry();
    print 'There are ', scalar(@entries), " entries in the database\n";

Set distinct or unique to 1 if you're after a unique list.

Throws an error in slurp mode when an invalid column name is given.

=cut
1336
1337sub AUTOLOAD {
1338
30
6238
        our $AUTOLOAD;
1339
30
119
        my ($column) = $AUTOLOAD =~ /::(\w+)$/;
1340
1341
30
45
        return if($column eq 'DESTROY');
1342
1343
30
51
        my $self = shift or return;
1344
1345
30
37
        Carp::croak(__PACKAGE__, ": Unknown column $column") if(!ref($self));
1346
1347        # Allow the AUTOLOAD feature to be disabled
1348
30
47
        Carp::croak(__PACKAGE__, ": Unknown column $column") if(exists($self->{'auto_load'}) && boolean($self->{'auto_load'})->isFalse());
1349
1350        # Validate column name - only allow safe column name
1351
30
57
        Carp::croak(__PACKAGE__, ": Invalid column name: $column") unless $column =~ /^[a-zA-Z_][a-zA-Z0-9_]*$/;
1352
1353
30
46
        my $table = $self->_open_table();
1354
1355
29
20
        my %params;
1356
29
61
        if(ref($_[0]) eq 'HASH') {
1357
2
2
2
2
                %params = %{$_[0]};
1358        } elsif((scalar(@_) % 2) == 0) {
1359
11
14
                %params = @_;
1360        } elsif(scalar(@_) == 1) {
1361                # Don't error on key-value databases, since there's no idea of columns
1362
16
24
                if($self->{'no_entry'} && !$self->{'berkeley'}) {
1363
0
0
                        Carp::croak(ref($self), "::($_[0]): ", $self->{'id'}, ' is not a column');
1364                }
1365
16
18
                $params{'entry'} = shift;
1366        }
1367
1368
29
32
        if($self->{'berkeley'}) {
1369
0
0
                if(my $id = $self->{'id'}) {
1370
0
0
                        return $self->{'berkeley'}->{$params{$id}};
1371                }
1372
0
0
                return $self->{'berkeley'}->{$params{'entry'}};
1373        }
1374
1375
29
33
        croak('Where did the data come from?') if(!defined($self->{'type'}));
1376
29
24
        my $query;
1377
29
18
        my $done_where = 0;
1378
29
54
        my $distinct = delete($params{'distinct'}) || delete($params{'unique'});
1379
1380
29
41
        if(wantarray && !$distinct) {
1381
5
17
                if(((scalar keys %params) == 0) && (my $data = $self->{'data'})) {
1382                        # Return all the entries in the column
1383
3
12
3
3
17
5
                        return map { $_->{$column} } values %{$data};
1384                }
1385
2
4
                if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1386
1
1
                        $query = "SELECT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1387
1
1
                        $done_where = 1;
1388                } else {
1389
1
2
                        $query = "SELECT $column FROM $table";
1390                }
1391        } else {
1392
24
36
                if(my $data = $self->{'data'}) {
1393                        # The data has been read in using Text::xSV::Slurp,
1394                        #       so no need to do any SQL
1395
18
24
                        $self->_debug('AUTOLOAD using slurped data');
1396
18
367
                        if($self->{'no_entry'}) {
1397
0
0
                                $self->_debug('no_entry is set');
1398
0
0
                                my ($key, $value) = %params;
1399
0
0
                                if(defined($key)) {
1400
0
0
                                        $self->_debug("key = $key, value = $value, column = $column");
1401
0
0
0
0
                                        foreach my $row(@{$data}) {
1402
0
0
                                                if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1403
0
0
                                                        if(defined($rc)) {
1404
0
0
                                                                $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1405                                                        } else {
1406
0
0
                                                                $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1407                                                        }
1408
0
0
                                                        return $rc
1409                                                }
1410                                        }
1411
0
0
                                        $self->_debug('not found in slurped data');
1412                                }
1413                        } elsif(((scalar keys %params) == 1) && defined(my $key = $params{'entry'})) {
1414                                # Look up the key
1415
1416                                # This weird code is to stop the data hash becoming polluted with empty
1417                                #       values as we look things up
1418                                # my $rc = $data->{$key}->{$column};
1419
14
11
                                my $rc;
1420
14
23
                                if(defined(my $hash = $data->{$key})) {
1421                                        # Look up the key
1422
10
14
                                        if(!exists($hash->{$column})) {
1423
1
7
                                                Carp::croak(__PACKAGE__, ": There is no column $column in $table");
1424                                        }
1425
9
9
                                        $rc = $hash->{$column};
1426                                }
1427
13
18
                                if(defined($rc)) {
1428
8
22
                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1429                                } else {
1430
5
19
                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1431                                }
1432
13
221
                                return $rc
1433                        } elsif((scalar keys %params) == 0) {
1434
2
2
                                if(wantarray) {
1435
2
3
                                        if($distinct) {
1436                                                # https://stackoverflow.com/questions/7651/how-do-i-remove-duplicate-items-from-an-array-in-perl
1437
2
7
7
2
2
7
7
2
                                                my %h = map { $_, 1 } map { $_->{$column} } values %{$data};
1438
2
6
                                                return keys %h;
1439                                        }
1440
0
0
0
0
0
0
                                        return map { $_->{$column} } values %{$data}
1441                                }
1442                                # FIXME - this works but really isn't the right way to do it
1443
0
0
0
0
                                foreach my $v (values %{$data}) {
1444
0
0
                                        return $v->{$column}
1445                                }
1446                        } else {
1447                                # It's keyed, but we're not querying off it
1448
2
3
                                my ($key, $value) = %params;
1449
2
2
1
3
                                foreach my $row (values %{$data}) {
1450
3
9
                                        if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1451
2
3
                                                if(defined($rc)) {
1452
2
4
                                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1453                                                } else {
1454
0
0
                                                        $self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1455                                                }
1456
2
41
                                                return $rc
1457                                        }
1458                                }
1459                        }
1460                        return
1461
0
0
                }
1462                # Data has not been slurped in
1463
6
14
                if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1464
0
0
                        $query = "SELECT DISTINCT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1465
0
0
                        $done_where = 1;
1466                } else {
1467
6
8
                        $query = "SELECT DISTINCT $column FROM $table";
1468                }
1469        }
1470
8
6
        my @args;
1471
8
17
        while(my ($key, $value) = each %params) {
1472
6
12
                $self->_debug(__PACKAGE__, ": AUTOLOAD adding key/value pair $key=>$value");
1473
6
382
                if(defined($value)) {
1474
6
7
                        if($done_where) {
1475
1
1
                                $query .= " AND $key = ?";
1476                        } else {
1477
5
4
                                $query .= " WHERE $key = ?";
1478
5
5
                                $done_where = 1;
1479                        }
1480
6
14
                        push @args, $value;
1481                } else {
1482
0
0
                        $self->_debug("AUTOLOAD params $key isn't defined");
1483
0
0
                        if($done_where) {
1484
0
0
                                $query .= " AND $key IS NULL";
1485                        } else {
1486
0
0
                                $query .= " WHERE $key IS NULL";
1487
0
0
                                $done_where = 1;
1488                        }
1489                }
1490        }
1491
8
8
        if(wantarray) {
1492
3
3
                $query .= " ORDER BY $column";
1493        } else {
1494
5
5
                $query .= ' LIMIT 1';
1495        }
1496
8
15
        if(scalar(@args) && $args[0]) {
1497
6
13
                $self->_debug("AUTOLOAD $query: ", join(', ', @args));
1498        } else {
1499
2
9
                $self->_debug("AUTOLOAD $query");
1500        }
1501
8
352
        my $cache;
1502
8
10
        my $key = ref($self) . '::';
1503
8
11
        if($cache = $self->{cache}) {
1504
0
0
                if(wantarray) {
1505
0
0
                        $key .= 'array ';
1506                }
1507
0
0
                if(defined($args[0])) {
1508
0
0
                        $key .= "fetchrow $query " . join(', ', @args);
1509                } else {
1510
0
0
                        $key .= "fetchrow $query";
1511                }
1512
0
0
                if(my $rc = $cache->get($key)) {
1513
0
0
                        $self->_debug('cache HIT');
1514
0
0
0
0
                        return wantarray ? @{$rc} : $rc;        # We stored a ref to the array
1515                }
1516
0
0
                $self->_debug('cache MISS');
1517        } else {
1518
8
9
                $self->_debug('cache not used');
1519        }
1520        # my $sth = $self->{$table}->prepare($query) || throw Error::Simple($query);
1521
8
373
        my $sth = $self->{$table}->prepare($query) || croak($query);
1522        # $sth->execute(@args) || throw Error::Simple($query);
1523
8
13712
        $sth->execute(@args) || croak($query);
1524
1525
8
18280
        if(wantarray) {
1526
3
10
3
3
175
12
                my @rc = map { $_->[0] } @{$sth->fetchall_arrayref()};
1527
3
15
                if($cache) {
1528
0
0
                        $cache->set($key, \@rc, $self->{'cache_duration'});       # Store a ref to the array
1529                }
1530
3
8
                Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
1531
3
181
                return @rc;
1532        }
1533
5
31
        my $rc = $sth->fetchrow_array();     # Return the first match only
1534
5
86
        $sth->finish();
1535
5
19
        if($cache) {
1536
0
0
                return $cache->set($key, $rc, $self->{'cache_duration'});
1537        }
1538
5
19
        return $rc;
1539}
1540
1541sub DESTROY
1542{
1543
31
8974
        if(defined($^V) && ($^V ge 'v5.14.0')) {
1544
31
54
                return if ${^GLOBAL_PHASE} eq 'DESTRUCT';       # >= 5.14.0 only
1545        }
1546
31
45
        my $self = shift;
1547
1548        # Clean up temporary file
1549
31
48
        if($self->{'temp'}) {
1550
0
0
                unlink delete $self->{'temp'};
1551        }
1552
1553        # Clean up database handles
1554
31
65
        my $table_name = $self->{'table'} || ref($self);
1555
31
107
        $table_name =~ s/.*:://;
1556
1557
31
50
        if(my $dbh = delete $self->{$table_name}) {
1558
14
122
                $dbh->disconnect() if $dbh->can('disconnect');
1559
14
294
                $dbh->finish() if $dbh->can('finish');
1560        }
1561
1562        # Clean up Berkeley DB
1563
31
44
        if($self->{'berkeley'}) {
1564
0
0
                eval {
1565
0
0
0
0
                        untie %{$self->{'berkeley'}};
1566                };
1567
0
0
                delete $self->{'berkeley'};
1568        }
1569
1570        # Clear all other attributes to break potential circular references
1571
31
54
        foreach my $key (keys %$self) {
1572
292
777
                delete $self->{$key};
1573        }
1574}
1575
1576# Determine the table and open the database
1577sub _open_table
1578{
1579
61
60
        my($self, $params) = @_;
1580
1581        # Get table name (remove package name prefix if present)
1582
61
191
        my $table = $params->{'table'} || $self->{'table'} || ref($self);
1583
61
139
        $table =~ s/.*:://;
1584
1585        # Open a connection if it's not already open
1586
61
138
        $self->_open() if((!$self->{$table}) && (!$self->{'data'}));
1587
1588
60
84
        return $table;
1589}
1590
1591# Determine whether a given file is a valid Berkeley DB file.
1592# It combines a fast preliminary check with a more thorough validation step for accuracy.
1593# It looks for the magic number at both byte 0 and byte 12
1594# TODO: Combine _db_0 and _db_12 as they are very similar routines
1595sub _is_berkeley_db {
1596
17
17
        my ($self, $file) = @_;
1597
1598        # Step 1: Check magic number
1599
17
200
        open my $fh, '<', $file or return 0;
1600
0
0
        binmode $fh;
1601
1602
0
0
        my $is_db = (($self->_is_berkeley_db_0($fh)) || ($self->_is_berkeley_db_12($fh)));
1603
0
0
        close $fh;
1604
1605
0
0
        if($is_db) {
1606                # Step 2: Attempt to open as Berkeley DB
1607
1608
0
0
                require DB_File && DB_File->import();
1609
1610
0
0
                my %bdb;
1611
0
0
                if(tie %bdb, 'DB_File', $file, O_RDONLY, 0644, $DB_File::DB_HASH) {
1612                        # untie %db;
1613
0
0
                        $self->{'berkeley'} = \%bdb;
1614
0
0
                        return 1;       # Successfully identified as a Berkeley DB file
1615                }
1616        }
1617
0
0
        return 0;
1618}
1619
1620# Determine whether a given file is a valid Berkeley DB file.
1621# It combines a fast preliminary check with a more thorough validation step for accuracy.
1622sub _is_berkeley_db_0
1623{
1624
0
0
        my ($self, $fh) = @_;
1625
1626        # Read the first 4 bytes (magic number)
1627
0
0
        read($fh, my $magic_bytes, 4) == 4 or return 0;
1628
1629        # Unpack both big-endian and little-endian values
1630
0
0
        my $magic_be = unpack('N', $magic_bytes);       # Big-endian
1631
0
0
        my $magic_le = unpack('V', $magic_bytes);       # Little-endian
1632
1633        # Known Berkeley DB magic numbers (in both endian formats)
1634
0
0
0
0
        my %known_magic = map { $_ => 1 } (
1635                0x00061561,     # Btree
1636                0x00053162,     # Hash
1637                0x00042253,     # Queue
1638                0x00052444,     # Recno
1639        );
1640
1641
0
0
        return($known_magic{$magic_be} || $known_magic{$magic_le});
1642}
1643
1644sub _is_berkeley_db_12
1645{
1646
0
0
        my ($self, $fh) = @_;
1647
0
0
        my $header;
1648
1649
0
0
        seek $fh, 12, 0 or return 0;
1650
0
0
        read($fh, $header, 4) or return 0;
1651
1652
0
0
        $header = substr(unpack('H*', $header), 0, 4);
1653
1654        # Berkeley DB magic numbers
1655
0
0
        return($header eq '6115' || $header eq '1561'); # Btree
1656}
1657
1658# Log and remember a message
1659sub _log
1660{
1661
186
228
        my ($self, $level, @messages) = @_;
1662
1663        # FIXME: add caller's function
1664        # if(($level eq 'warn') || ($level eq 'notice')) {
1665
186
186
123
502
                push @{$self->{'messages'}}, { level => $level, message => join('', grep defined, @messages) };
1666        # }
1667
1668
186
349
        if(scalar(@messages) && (my $logger = $self->{'logger'})) {
1669
186
381
                $self->{'logger'}->$level(join('', grep defined, @messages));
1670        }
1671}
1672
1673sub _debug {
1674
138
235
        my $self = shift;
1675
138
164
        $self->_log('debug', @_);
1676}
1677
1678sub _trace {
1679
47
38
        my $self = shift;
1680
47
68
        $self->_log('trace', @_);
1681}
1682
1683# Emit a warning message somewhere
1684sub _warn {
1685
0
0
        my $self = shift;
1686
0
0
        my $params = Params::Get::get_params('warning', @_);
1687
1688
0
0
        $self->_log('warn', $params->{'warning'});
1689
0
0
        Carp::carp(join('', grep defined, $params->{'warning'}));
1690}
1691
1692# Die
1693sub _fatal {
1694
1
1
        my $self = shift;
1695
1
1
        my $params = Params::Get::get_params('warning', @_);
1696
1697
1
9
        $self->_log('error', $params->{'warning'});
1698
1
213
        Carp::croak(join('', grep defined, $params->{'warning'}));
1699}
1700
1701 - 1753
=head1 AUTHOR

Nigel Horne, C<< <njh at nigelhorne.com> >>

=head1 SUPPORT

This module is provided as-is without any warranty.

Please report any bugs or feature requests to C<bug-database-abstraction at rt.cpan.org>,
or through the web interface at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Database-Abstraction>.
I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.

=head1 BUGS

The default delimiter for CSV files is set to '!', not ',' for historical reasons.
I really ought to fix that.

It would be nice for the key column to be called key, not entry,
however key's a reserved word in SQL.

The no_entry parameter should be no_id.

XML slurping is hard,
so if XML fails for you on a small file force non-slurping mode with

    $foo = MyPackageName::Database::Foo->new({
        directory => '/var/dat',
        max_slurp_size => 0  # force to not use slurp and therefore to use SQL
    });

=head1 SEE ALSO

=over 4

=item * Test coverage report: L<https://nigelhorne.github.io/Database-Abstraction/coverage/>

=back

=head1 LICENSE AND COPYRIGHT

Copyright 2015-2025 Nigel Horne.

This program is released under the following licence: GPL2.
Usage is subject to licence terms.
The licence terms of this software are as follows:
Personal single user, single computer use: GPL2
All other users (for example, Commercial, Charity, Educational, Government)
must apply in writing for a licence for use from Nigel Horne at the
above e-mail.

=cut
1754
17551;