File Coverage

File:	blib/lib/Database/Abstraction.pm
Coverage:	64.5%

line	stmt	bran	cond	sub	time	code
1						package Database::Abstraction;
2
3						# Author Nigel Horne: njh@nigelhorne.com
4						# Copyright (C) 2015-2025, Nigel Horne
5
6						# Usage is subject to licence terms.
7						# The licence terms of this software are as follows:
8						# Personal single user, single computer use: GPL2
9						# All other users (for example, Commercial, Charity, Educational, Government)
10						# must apply in writing for a licence for use from Nigel Horne at the
11						# above e-mail.
12
13						# TODO: Switch "entry" to off by default, and enable by passing 'entry'
14						# though that wouldn't be so nice for AUTOLOAD
15						# TODO: support a directory hierarchy of databases
16						# TODO: consider returning an object or array of objects, rather than hashes
17						# TODO: Add redis database - could be of use for Geo::Coder::Free
18						# use select() to select a database - use the table arg
19						# new(database => 'redis://servername');
20						# TODO: Add a "key" property, defaulting to "entry", which would be the name of the key
21						# TODO: The maximum number to return should be tuneable (as a LIMIT)
22						# TODO: Add full CRUD support
23						# TODO: It would be better for the default sep_char to be ',' rather than '!'
24						# FIXME: t/xml.t fails in slurping mode
25						# TODO: Other databases e.g., Redis, noSQL, remote databases such as MySQL, PostgreSQL
26						# TODO: The no_entry/entry terminology is confusing. Replace with no_id/id_column
27						# TODO: Add support for DBM::Deep
28						# TODO: Log queries and the time that they took to execute per database
29
30	16 16 16			16	2001356 15 310	use warnings;
31	16 16 16			16	27 10 135	use strict;
32
33	16 16 16			16	2356 6824 26	use boolean;
34	16 16 16			16	476 14 330	use Carp;
35	16 16 16			16	22 16 204	use Data::Dumper;
36	16 16 16			16	3084 41618 432	use Data::Reuse;
37	16 16 16			16	3311 205138 1752	use DBD::SQLite::Constants qw/:file_open/; # For SQLITE_OPEN_READONLY
38	16 16 16			16	82 16 1572	use Fcntl; # For O_RDONLY
39	16 16 16			16	36 10 207	use File::Spec;
40	16 16 16			16	2710 3532 335	use File::pfopen 0.03; # For $mode and list context
41	16 16 16			16	5333 98576 555	use File::Temp;
42	16 16 16			16	3038 430273 274	use Log::Abstraction 0.26;
43	16 16 16			16	3038 61909 250	use Object::Configure 0.16;
44	16 16 16			16	34 71 229	use Params::Get 0.13;
45						# use Error::Simple; # A nice idea to use this, but it doesn't play well with "use lib"
46	16 16 16			16	36 17 281	use Scalar::Util;
47
48						our %defaults;
49	16 16 16			16	28 29 52744	use constant DEFAULT_MAX_SLURP_SIZE => 16 * 1024; # CSV files <= than this size are read into memory
50
51 - 59						=head1 NAME Database::Abstraction - Read-only Database Abstraction Layer (ORM) =head1 VERSION Version 0.33 =cut
60
61						our $VERSION = '0.33';
62
63 - 193						=head1 DESCRIPTION C<Database::Abstraction> is a read-only database abstraction layer (ORM) for Perl, designed to provide a simple interface for accessing and querying various types of databases such as CSV, XML, and SQLite without the need to write SQL queries. It promotes code maintainability by abstracting database access logic into a single interface, allowing users to switch between different storage formats seamlessly. The module supports caching for performance optimization, flexible logging for debugging and monitoring, and includes features like the AUTOLOAD method for convenient access to database columns. By handling numerous database and file formats, C<Database::Abstraction> adds versatility and simplifies the management of read-intensive applications. =head1 SYNOPSIS Abstract class giving read-only access to CSV, XML, BerkeleyDB and SQLite databases via Perl without writing any SQL, using caching for performance optimization. The module promotes code maintainability by abstracting database access logic into a single interface. Users can switch between different storage formats without changing application logic. The ability to handle numerous database and file formats adds versatility and makes it useful for a variety of applications. It's a simple ORM like interface which, for all of its simplicity, allows you to do a lot of the heavy lifting of simple database operations without any SQL. It offers functionalities like opening the database and fetching data based on various criteria. Built-in support for flexible and configurable caching improves performance for read-intensive applications. Supports logging to debug and monitor database operations. Look for databases in $directory in this order: =over 4 =item 1 C<SQLite> File ends with .sql =item 2 C<PSV> Pipe separated file, file ends with .psv =item 3 C<CSV> File ends with .csv or .db, can be gzipped. Note the default sep_char is '!' not ',' =item 4 C<XML> File ends with .xml =item 5 C<BerkeleyDB> File ends with .db =back The AUTOLOAD feature allows for convenient access to database columns using method calls. It hides the complexity of querying the underlying data storage. If the table has a key column, entries are keyed on that and sorts are based on it. To turn that off, pass 'no_entry' to the constructor, for legacy reasons it's enabled by default. The key column's default name is 'entry', but it can be overridden by the 'id' parameter. Arrays are made read-only before being returned. To disable that, pass C<no_fixate> to the constructor. CSV files that are not no_entry can have empty lines or comment lines starting with '#', to make them more readable. =head1 EXAMPLE If the file /var/dat/foo.csv contains something like: "customer_id","name" "plugh","John" "xyzzy","Jane" Create a driver for the file in .../Database/foo.pm: package Database::foo; use Database::Abstraction; our @ISA = ('Database::Abstraction'); # Regular CSV: There is no entry column and the separators are commas sub new { my $class = shift; my %args = (ref($_[0]) eq 'HASH') ? %{$_[0]} : @_; return $class->SUPER::new(no_entry => 1, sep_char => ',', %args); } You can then use this code to access the data via the driver: # Opens the file, e.g. /var/dat/foo.csv my $foo = Database::foo->new(directory => '/var/dat'); # Prints "John" print 'Customer name ', $foo->name(customer_id => 'plugh'), "\n"; # Prints: # $VAR1 = { # 'customer_id' => 'xyzzy', # 'name' => 'Jane' # }; my $row = $foo->fetchrow_hashref(customer_id => 'xyzzy'); print Data::Dumper->new([$row])->Dump(); =head1 SUBROUTINES/METHODS =head2 init Initializes the abstraction class and its subclasses with optional arguments for configuration. Database::Abstraction::init(directory => '../data'); See the documentation for new to see what variables can be set. Returns a reference to a hash of the current values. Therefore when given with no arguments you can get the current default values: my $defaults = Database::Abstraction::init(); print $defaults->{'directory'}, "\n"; =cut
194
195						# Subroutine to initialize with args
196						sub init
197						{
198	46	100		46	465825	if(my $params = Params::Get::get_params(undef, @_)) {
199	44	100	67		454	if(($params->{'expires_in'} && !$params->{'cache_duration'})) {
200						# Compatibility with CHI
201	1				1	$params->{'cache_duration'} = $params->{'expires_in'};
202						}
203
204	44 44				38 93	%defaults = (%defaults, %{$params});
205	44		100		144	$defaults{'cache_duration'} \|\|= '1 hour';
206						}
207
208	46				875	return \%defaults
209						}
210
211 - 221						=head2 import The module can be initialised by the C<use> directive. use Database::Abstraction 'directory' => '/etc/data'; or use Database::Abstraction { 'directory' => '/etc/data' }; =cut
222
223						sub import
224						{
225	38			38	1915	my $pkg = shift;
226
227	38	50 0 0	0		65	if((scalar(@_) % 2) == 0) {
228	38				39	my %h = @_;
229	38				77	init(Object::Configure::configure($pkg, \%h));
230						} elsif((scalar(@_) == 1) && (ref($_[0]) eq 'HASH')) {
231	0				0	init(Object::Configure::configure($pkg, $_[0]));
232						} elsif(scalar(@_) > 0) { # >= 3 would also work here
233	0				0	init(\@_);
234						}
235						}
236
237 - 316						=head2 new Create an object to point to a read-only database. Arguments: Takes different argument formats (hash or positional) =over 4 =item * C<auto_load> Enable/disable the AUTOLOAD feature. The default is to have it enabled. =item * C<cache> Place to store results =item * C<cache_duration> How long to store results in the cache (default is 1 hour). =item * C<config_file> Points to a configuration file which contains the parameters to C<new()>. The file can be in any common format including C<YAML>, C<XML>, and C<INI>. This allows the parameters to be set at run time. =item * C<expires_in> Synonym of C<cache_duration>, for compatibility with C<CHI>. =item * C<dbname> The prefix of name of the database file (default is name of the table). The database will be held in a file such as $dbname.csv. =item * C<directory> Where the database file is held. If only one argument is given to C<new()>, it is taken to be C<directory>. =item * C<filename> Filename containing the data. When not given, the filename is derived from the tablename which in turn comes from the class name. =item * C<logger> Takes an optional parameter logger, which is used for warnings and traces. Can be an object that understands warn() and trace() messages, such as a L<Log::Log4perl> or L<Log::Any> object, a reference to code, or a filename. =item * C<max_slurp_size> CSV/PSV/XML files smaller than this are held in a HASH in RAM (default is 16K), falling back to SQL on larger data sets. Setting this value to 0 will turn this feature off, thus forcing SQL to be used to access the database =back If the arguments are not set, tries to take from class level defaults. Checks for abstract class usage. Slurp mode assumes that the key column (entry) is unique. If it isn't, searches will be incomplete. Turn off slurp mode on those databases, by setting a low value for max_slurp_size. Clones existing objects with or without modifications. Uses Carp::carp to log warnings for incorrect usage or potential mistakes. =cut
317
318						sub new {
319	35			35	93295	my $class = shift;
320	35				25	my %args;
321
322						# Handle hash or hashref arguments
323	35	100 100	100		98	if((scalar(@_) == 1) && !ref($_[0])) {
324	4				7	$args{'directory'} = $_[0];
325						} elsif(my $params = Params::Get::get_params(undef, @_)) {
326	24 24				275 37	%args = %{$params};
327						}
328
329	35	50 100 100			142	if(!defined($class)) {
330	0	0			0	if((scalar keys %args) > 0) {
331						# Using Database::Abstraction->new(), not Database::Abstraction::new()
332	0				0	carp(__PACKAGE__, ' use ->new() not ::new() to instantiate');
333	0				0	return;
334						}
335						# FIXME: this only works when no arguments are given
336	0				0	$class = __PACKAGE__;
337						} elsif($class eq __PACKAGE__) {
338	1				12	croak("$class: abstract class");
339						} elsif(Scalar::Util::blessed($class)) {
340						# If $class is an object, clone it with new arguments
341	2 2				1 12	return bless { %{$class}, %args }, ref($class);
342						}
343
344						# Load the configuration from a config file, if provided
345	32 32				21 45	%args = %{Object::Configure::configure($class, \%args)};
346
347						# Validate logger object has required methods
348	32	50			46988	if(defined $args{'logger'}) {
349	32	50	25		191	unless(Scalar::Util::blessed($args{'logger'}) && $args{'logger'}->can('info') && $args{'logger'}->can('error')) {
350	0				0	Carp::croak("Logger must be an object with info() and error() methods");
351						}
352						}
353
354	32	100	100		80	croak("$class: where are the files?") unless($args{'directory'} \|\| $defaults{'directory'});
355
356	31	100	67 33		183	croak("$class: ", $args{'directory'} \|\| $defaults{'directory'}, ' is not a directory') unless(-d ($args{'directory'} \|\| $defaults{'directory'}));
357
358						# init(\%args);
359
360						# return bless {
361						# logger => $args{'logger'} \|\| $logger,
362						# directory => $args{'directory'} \|\| $directory, # The directory containing the tables in XML, SQLite or CSV format
363						# cache => $args{'cache'} \|\| $cache,
364						# cache_duration => $args{'cache_duration'} \|\| $cache_duration \|\| '1 hour',
365						# table => $args{'table'}, # The name of the file containing the table, defaults to the class name
366						# no_entry => $args{'no_entry'} \|\| 0,
367						# }, $class;
368
369						# Re-seen keys take precedence, so defaults come first
370						# print STDERR ">>>>>>>>>>>>>>>>>>>\n";
371						# print STDERR __LINE__, "\n";
372						# print STDERR $args{'id'} \|\| 'undef';
373						# print STDERR "\n";
374	30				132	return bless {
375						no_entry => 0,
376						no_fixate => 0,
377						id => 'entry',
378						cache_duration => '1 hour',
379						max_slurp_size => DEFAULT_MAX_SLURP_SIZE,
380						%defaults,
381						%args,
382						}, $class;
383						}
384
385 - 389						=head2 set_logger Sets the class, code reference, or file that will be used for logging. =cut
390
391						sub set_logger
392						{
393	6			6	1820	my $self = shift;
394	6				8	my $params = Params::Get::get_params('logger', @_);
395
396	5	50			56	if(my $logger = $params->{'logger'}) {
397	5	100			6	if(Scalar::Util::blessed($logger)) {
398	2				8	$self->{'logger'} = $logger;
399						} else {
400	3				5	$self->{'logger'} = Log::Abstraction->new($logger);
401						}
402	5				50	return $self;
403						}
404	0				0	Carp::croak('Usage: set_logger(logger => $logger)')
405						}
406
407						# Open the database connection based on the specified type (e.g., SQLite, CSV).
408						# Read the data into memory or establish a connection to the database file.
409						# column_names allows the column names to be overridden on CSV files
410
411						sub _open
412						{
413	17	50		17	34	if(!UNIVERSAL::isa((caller)[0], __PACKAGE__)) {
414	0				0	Carp::croak('Illegal Operation: This method can only be called by a subclass');
415						}
416
417	17				167	my $self = shift;
418	17				31	my $params = Params::Get::get_params(undef, @_);
419
420	17	100	67		183	$params->{'sep_char'} \|\|= $self->{'sep_char'} ? $self->{'sep_char'} : '!';
421	17		67		25	my $max_slurp_size = $params->{'max_slurp_size'} \|\| $self->{'max_slurp_size'};
422
423	17		33		47	my $table = $self->{'table'} \|\| ref($self);
424	17				31	$table =~ s/.*:://;
425
426	17				46	$self->_trace(ref($self), ": _open $table");
427
428	17	50			632	return if($self->{$table});
429
430						# Read in the database
431	17				13	my $dbh;
432
433	17		33		26	my $dir = $self->{'directory'} \|\| $defaults{'directory'};
434	17		67		57	my $dbname = $self->{'dbname'} \|\| $defaults{'dbname'} \|\| $table;
435	17				109	my $slurp_file = File::Spec->catfile($dir, "$dbname.sql");
436
437	17				47	$self->_debug("_open: try to open $slurp_file");
438
439						# Look at various places to find the file and derive the file type from the file's name
440	17	50			623	if(-r $slurp_file) {
441						# SQLite file
442	0	0	0		0	require DBI && DBI->import() unless DBI->can('connect');
443
444	0				0	$dbh = DBI->connect("dbi:SQLite:dbname=$slurp_file", undef, undef, {
445						sqlite_open_flags => SQLITE_OPEN_READONLY,
446						});
447						}
448	17	50 50			113	if($dbh) {
449	0				0	$dbh->do('PRAGMA synchronous = OFF');
450	0				0	$dbh->do('PRAGMA cache_size = -4096'); # Use 4MB cache - negative = KB)
451	0				0	$dbh->do('PRAGMA journal_mode = OFF'); # Read-only, no journal needed
452	0				0	$dbh->do('PRAGMA temp_store = MEMORY'); # Store temp data in RAM
453	0				0	$dbh->do('PRAGMA mmap_size = 1048576'); # Use 1MB memory-mapped I/O
454	0				0	$dbh->sqlite_busy_timeout(100000); # 10s
455	0				0	$self->_debug("read in $table from SQLite $slurp_file");
456	0				0	$self->{'type'} = 'DBI';
457						} elsif($self->_is_berkeley_db(File::Spec->catfile($dir, "$dbname.db"))) {
458	0				0	$self->_debug("$table is a BerkeleyDB file");
459	0				0	$self->{'type'} = 'BerkeleyDB';
460						} else {
461	17				15	my $fin;
462	17				34	($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv.gz:db.gz', '<');
463	17	50	33		475	if(defined($slurp_file) && (-r $slurp_file)) {
464	0				0	require Gzip::Faster;
465	0				0	Gzip::Faster->import();
466
467	0				0	close($fin);
468	0				0	$fin = File::Temp->new(SUFFIX => '.csv', UNLINK => 0);
469	0				0	print $fin gunzip_file($slurp_file);
470	0				0	$slurp_file = $fin->filename();
471	0				0	$self->{'temp'} = $slurp_file;
472						} else {
473	17				27	($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'psv', '<');
474	17	100			338	if(defined($fin)) {
475						# Pipe separated file
476	4				6	$params->{'sep_char'} = '\|';
477						} else {
478						# CSV file
479	13				28	($fin, $slurp_file) = File::pfopen::pfopen($dir, $dbname, 'csv:db', '<');
480						}
481						}
482	17	100	67		336	if(my $filename = $self->{'filename'} \|\| $defaults{'filename'}) {
483	1				10	$self->_debug("Looking for $filename in $dir");
484	1				14	$slurp_file = File::Spec->catfile($dir, $filename);
485						}
486	17	100	67		91	if(defined($slurp_file) && (-r $slurp_file)) {
487	14	100			76	close($fin) if(defined($fin));
488	14				16	my $sep_char = $params->{'sep_char'};
489
490	14				38	$self->_debug(__LINE__, ' of ', __PACKAGE__, ": slurp_file = $slurp_file, sep_char = $sep_char");
491
492	14	100			604	if($params->{'column_names'}) {
493						$dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef,
494						{
495						csv_sep_char => $sep_char,
496						csv_tables => {
497						$table => {
498	1				5	col_names => $params->{'column_names'},
499						},
500						},
501						}
502						);
503						} else {
504	13				53	$dbh = DBI->connect("dbi:CSV:db_name=$slurp_file", undef, undef, { csv_sep_char => $sep_char});
505						}
506	14				525686	$dbh->{'RaiseError'} = 1;
507
508	14				83	$self->_debug("read in $table from CSV $slurp_file");
509
510	14				708	$dbh->{csv_tables}->{$table} = {
511						allow_loose_quotes => 1,
512						blank_is_undef => 1,
513						empty_is_undef => 1,
514						binary => 1,
515						f_file => $slurp_file,
516						escape_char => '\\',
517						sep_char => $sep_char,
518						# Don't do this, causes "Bizarre copy of HASH
519						# in scalar assignment in error_diag
520						# RT121127
521						# auto_diag => 1,
522						auto_diag => 0,
523						# Don't do this, it causes "Attempt to free unreferenced scalar"
524						# callbacks => {
525						# after_parse => sub {
526						# my ($csv, @rows) = @_;
527						# my @rc;
528						# foreach my $row(@rows) {
529						# if($row->[0] !~ /^#/) {
530						# push @rc, $row;
531						# }
532						# }
533						# return @rc;
534						# }
535						# }
536						};
537
538						# my %options = (
539						# allow_loose_quotes => 1,
540						# blank_is_undef => 1,
541						# empty_is_undef => 1,
542						# binary => 1,
543						# f_file => $slurp_file,
544						# escape_char => '\\',
545						# sep_char => $sep_char,
546						# );
547
548						# $dbh->{csv_tables}->{$table} = \%options;
549						# delete $options{f_file};
550
551						# require Text::CSV::Slurp;
552						# Text::CSV::Slurp->import();
553						# $self->{'data'} = Text::CSV::Slurp->load(file => $slurp_file, %options);
554
555						# Can't slurp when we want to use our own column names as Text::xSV::Slurp has no way to override the names
556						# FIXME: Text::xSV::Slurp can't cope well with quotes in field contents
557	14	100	100		11128	if(((-s $slurp_file) <= $max_slurp_size) && !$params->{'column_names'}) {
558	12	50			49	if((-s $slurp_file) == 0) {
559						# Empty file
560	0				0	$self->{'data'} = ();
561						} else {
562	12				1597	require Text::xSV::Slurp;
563	12				24477	Text::xSV::Slurp->import();
564
565	12				25	$self->_debug('slurp in');
566
567	12				633	my $dataref = xsv_slurp(
568						shape => 'aoh',
569						text_csv => {
570						sep_char => $sep_char,
571						allow_loose_quotes => 1,
572						blank_is_undef => 1,
573						empty_is_undef => 1,
574						binary => 1,
575						escape_char => '\\',
576						},
577						# string => \join('', grep(!/^\s*(#\|$)/, <DATA>))
578						file => $slurp_file
579						);
580
581						# Ignore blank lines or lines starting with # in the CSV file
582	12 39 60 12				2170 46 56 13	my @data = grep { $_->{$self->{'id'}} !~ /^\s*#/ } grep { defined($_->{$self->{'id'}}) } @{$dataref};
583
584	12	100			18	if($self->{'no_entry'}) {
585						# Not keyed, will need to scan each entry
586	3				8	$self->{'data'} = @data;
587						} else {
588						# keyed on the $self->{'id'} (default: "entry") column
589						# while(my $d = shift @data) {
590						# $self->{'data'}->{$d->{$self->{'id'}}} = $d;
591						# }
592						# Build hash directly from the filtered array, better to use map to avoid data copy
593						# and enclose in { } to ensure it's a hash ref
594	9 34				8 46	$self->{'data'} = { map { $_->{$self->{'id'}} => $_ } @data };
595						}
596						}
597						}
598	14				33	$self->{'type'} = 'CSV';
599						} else {
600	3				18	$slurp_file = File::Spec->catfile($dir, "$dbname.xml");
601	3	100			12	if(-r $slurp_file) {
602	2	100			6	if((-s $slurp_file) <= $max_slurp_size) {
603	1				345	require XML::Simple;
604	1				3742	XML::Simple->import();
605
606	1				26	my $xml = XMLin($slurp_file);
607	1 1				18315 2	my @keys = keys %{$xml};
608	1				1	my $key = $keys[0];
609	1				1	my @data;
610	1	50 0 0	0		2	if(ref($xml->{$key}) eq 'ARRAY') {
611	1 1				1 2	@data = @{$xml->{$key}};
612						} elsif(ref($xml) eq 'ARRAY') {
613	0 0				0 0	@data = @{$xml};
614						} elsif((ref($xml) eq 'HASH') && !$self->{'no_entry'}) {
615	0 0	0			0 0	if(scalar(keys %{$xml}) == 1) {
616	0	0			0	if($xml->{$table}) {
617	0				0	@data = $xml->{$table};
618						} else {
619	0				0	die 'TODO: import arbitrary XML with "entry" field';
620						}
621						} else {
622	0				0	die 'TODO: import arbitrary XML (differnt number of keys)';
623						}
624						} else {
625	0				0	die 'TODO: import arbitrary XML, cannot currently handle ', ref($xml);
626						}
627	1				8	$self->{'data'} = ();
628	1	50			2	if($self->{'no_entry'}) {
629						# Not keyed, will need to scan each entry
630	0				0	my $i = 0;
631	0				0	foreach my $d(@data) {
632	0				0	$self->{'data'}->{$i++} = $d;
633						}
634						} else {
635						# keyed on the $self->{'id'} (default: "entry") column
636	1				1	foreach my $d(@data) {
637	5				6	$self->{'data'}->{$d->{$self->{'id'}}} = $d;
638						}
639						}
640						} else {
641	1				3	$dbh = DBI->connect('dbi:XMLSimple(RaiseError=>1):');
642	1				85778	$dbh->{'RaiseError'} = 1;
643	1				8	$self->_debug("read in $table from XML $slurp_file");
644	1				17	$dbh->func($table, 'XML', $slurp_file, 'xmlsimple_import');
645						}
646						} else {
647						# throw Error(-file => "$dir/$table");
648	1				4	$self->_fatal("Can't find a file called '$dbname' for the table $table in $dir");
649						}
650	2				8	$self->{'type'} = 'XML';
651						}
652						}
653
654	16 0	50	67		77 0	Data::Reuse::fixate(%{$self->{'data'}}) if($self->{'data'} && (ref($self->{'data'} eq 'HASH')));
655
656	16				30	$self->{$table} = $dbh;
657	16				96	my @statb = stat($slurp_file);
658	16				23	$self->{'_updated'} = $statb[9];
659
660	16				28	return $self;
661						}
662
663 - 675						=head2 selectall_arrayref Returns a reference to an array of hash references of all the data meeting the given criteria. Note that since this returns an array ref, optimisations such as "LIMIT 1" will not be used. Use caching if that is available. Returns undef if there are no matches. =cut
676
677						sub selectall_arrayref {
678	8			8	641	my $self = shift;
679	8				5	my $params;
680
681	8	100 100			18	if($self->{'no_entry'}) {
682	1				3	$params = Params::Get::get_params(undef, \@_);
683						} elsif(scalar(@_)) {
684	4				6	$params = Params::Get::get_params('entry', @_);
685						}
686
687	8	50			54	if($self->{'berkeley'}) {
688	0				0	$self->_fatal(ref($self), ': selectall_arrayref is meaningless on a NoSQL database');
689						}
690
691	8				13	my $table = $self->_open_table($params);
692
693	8	100			12	if($self->{'data'}) {
694	4 4	100 100	50		2 5	if(scalar(keys %{$params}) == 0) {
695	2				6	$self->_trace("$table: selectall_arrayref fast track return");
696	2	50			29	if(ref($self->{'data'}) eq 'HASH') {
697						# $self->{'data'} looks like this:
698						# key1 => {
699						# entry => key1,
700						# field1 => value1,
701						# field2 => value2
702						# }, key2 => {
703						# entry => key2,
704						# field1 => valuea,
705						# field2 => valueb
706						# }
707	2 2				2 5	$self->_debug("$table: returning ", scalar keys %{$self->{'data'}}, ' entries');
708	2 2	50			22 2	if(scalar keys %{$self->{'data'}} <= 10) {
709	2				6	$self->_debug(Dumper($self->{'data'}));
710						}
711	2				21	my @rc;
712	2 2				2 3	foreach my $k (keys %{$self->{'data'}}) {
713	9				8	push @rc, $self->{'data'}->{$k};
714						}
715	2				4	return Return::Set::set_return(\@rc, { type => 'arrayref' });
716						}
717	0				0	return Return::Set::set_return($self->{'data'}, { type => 'arrayref'});
718						# my @rc = values %{$self->{'data'}};
719						# return @rc;
720	2				4	} elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
721	1				3	return Return::Set::set_return([$self->{'data'}->{$params->{'entry'}}], { type => 'arrayref' });
722						}
723						}
724
725	5				4	my $query;
726	5				3	my $done_where = 0;
727
728	5	100	100		14	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
729	3				4	$query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
730	3				3	$done_where = 1;
731						} else {
732	2				3	$query = "SELECT * FROM $table";
733						}
734
735	5				2	my @query_args;
736	5 5				4 9	foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
737	3				2	my $arg = $params->{$c1};
738	3	50			4	if(ref($arg)) {
739						# throw Error::Simple("$query: argument is not a string: " . ref($arg));
740	0				0	$self->_fatal("selectall_arrayref(): $query: argument is not a string: ", ref($arg));
741						}
742	3	50			3	if(!defined($arg)) {
743	0				0	my @call_details = caller(0);
744						# throw Error::Simple("$query: value for $c1 is not defined in call from " .
745						# $call_details[2] . ' of ' . $call_details[1]);
746	0				0	Carp::croak("$query: value for $c1 is not defined in call from ",
747						$call_details[2], ' of ', $call_details[1]);
748						}
749
750	3				1	my $keyword;
751	3	50			4	if($done_where) {
752	3				2	$keyword = 'AND';
753						} else {
754	0				0	$keyword = 'WHERE';
755	0				0	$done_where = 1;
756						}
757	3	50			4	if($arg =~ /[%_]/) {
758	0				0	$query .= " $keyword $c1 LIKE ?";
759						} else {
760	3				3	$query .= " $keyword $c1 = ?";
761						}
762	3				3	push @query_args, $arg;
763						}
764	5	100			8	if(!$self->{no_entry}) {
765	4				4	$query .= ' ORDER BY ' . $self->{'id'};
766						}
767
768	5	100			13	if(defined($query_args[0])) {
769	3				5	$self->_debug("selectall_arrayref $query: ", join(', ', @query_args));
770						} else {
771	2				4	$self->_debug("selectall_arrayref $query");
772						}
773
774	5				95	my $key;
775						my $c;
776	5	50			7	if($c = $self->{cache}) {
777	0				0	$key = ref($self) . "::$query array";
778	0	0			0	if(defined($query_args[0])) {
779	0				0	$key .= ' ' . join(', ', @query_args);
780						}
781	0	0			0	if(my $rc = $c->get($key)) {
782	0				0	$self->_debug('cache HIT');
783	0				0	return $rc; # We stored a ref to the array
784
785						# This use of a temporary variable is to avoid
786						# "Implicit scalar context for array in return"
787						# my @rc = @{$rc};
788						# return @rc;
789						}
790	0				0	$self->_debug('cache MISS');
791						} else {
792	5				5	$self->_debug('cache not used');
793						}
794
795	5	50			65	if(my $sth = $self->{$table}->prepare($query)) {
796	5	50			5800	$sth->execute(@query_args) \|\|
797						# throw Error::Simple("$query: @query_args");
798						croak("$query: @query_args");
799
800	5				9031	my $rc;
801	5				33	while(my $href = $sth->fetchrow_hashref()) {
802	27 27 27	50			678 53 28	push @{$rc}, $href if(scalar keys %{$href});
803						}
804	5	50			110	if($c) {
805	0				0	$c->set($key, $rc, $self->{'cache_duration'}); # Store a ref to the array
806						}
807
808	5 5	50			8 10	Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
809
810	5				856	return $rc;
811						}
812	0				0	$self->_warn("selectall_array failure on $query: @query_args");
813						# throw Error::Simple("$query: @query_args");
814	0				0	croak("$query: @query_args");
815
816						# my @rc = grep { defined $_ } $self->selectall_array(@_);
817
818						# return if(scalar(@rc) == 0);
819
820						# Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
821						# return \@rc;
822						}
823
824 - 828						=head2 selectall_hashref Deprecated misleading legacy name for selectall_arrayref. =cut
829
830						sub selectall_hashref
831						{
832	6			6	1145	my $self = shift;
833	6				11	return $self->selectall_arrayref(@_);
834						}
835
836 - 845						=head2 selectall_array Similar to selectall_array but returns an array of hash references. Con: Copies more data around than selectall_arrayref Pro: Better determination of list vs scalar mode than selectall_arrayref by setting "LIMIT 1" TODO: Remove duplicated code =cut
846
847						sub selectall_array
848						{
849	7			7	5	my $self = shift;
850
851	7	50			12	if($self->{'berkeley'}) {
852	0				0	Carp::croak(ref($self), ': selectall_array is meaningless on a NoSQL database');
853						}
854
855	7				13	my $params = Params::Get::get_params(undef, \@_);
856	7				72	my $table = $self->_open_table($params);
857
858	7	100			12	if($self->{'data'}) {
859	4 4	100 50	25		2 5	if(scalar(keys %{$params}) == 0) {
860	3				6	$self->_trace("$table: selectall_array fast track return");
861	3	50			40	if(ref($self->{'data'}) eq 'HASH') {
862	3 3				3 7	return values %{$self->{'data'}};
863						}
864	0 0				0 0	return @{$self->{'data'}};
865						# my @rc = values %{$self->{'data'}};
866						# return @rc;
867	1				4	} elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
868	0				0	return $self->{'data'}->{$params->{'entry'}};
869						}
870						}
871
872	4				1	my $query;
873	4				4	my $done_where = 0;
874
875	4	100	100		11	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
876	2				2	$query = "SELECT * FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
877	2				1	$done_where = 1;
878						} else {
879	2				1	$query = "SELECT * FROM $table";
880						}
881
882	4				4	my @query_args;
883	4 4				2 9	foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
884	3				3	my $arg = $params->{$c1};
885	3	50			4	if(ref($arg)) {
886						# throw Error::Simple("$query: argument is not a string: " . ref($arg));
887	0				0	$self->_fatal("selectall_array(): $query: argument is not a string: ", ref($arg));
888						}
889
890	3				2	my $keyword;
891	3	100			4	if($done_where) {
892	2				11	$keyword = 'AND';
893						} else {
894	1				1	$keyword = 'WHERE';
895	1				1	$done_where = 1;
896						}
897	3	100			3	if(!defined($arg)) {
898	1				2	$query .= " $keyword $c1 IS NULL"
899						} else {
900	2	50			4	if($arg =~ /[%_]/) {
901	0				0	$query .= " $keyword $c1 LIKE ?";
902						} else {
903	2				2	$query .= " $keyword $c1 = ?";
904						}
905	2				2	push @query_args, $arg;
906						}
907						}
908	4	100			8	if(!$self->{no_entry}) {
909	3				4	$query .= ' ORDER BY ' . $self->{'id'};
910						}
911	4	50			5	if(!wantarray) {
912	0				0	$query .= ' LIMIT 1';
913						}
914
915	4	100			4	if(defined($query_args[0])) {
916	2				6	$self->_debug("selectall_array $query: ", join(', ', @query_args));
917						} else {
918	2				4	$self->_debug("selectall_array $query");
919						}
920
921	4				64	my $key;
922						my $c;
923	4	50			19	if($c = $self->{cache}) {
924	0				0	$key = ref($self) . '::' . $query;
925	0	0			0	if(wantarray) {
926	0				0	$key .= ' array';
927						}
928	0	0			0	if(defined($query_args[0])) {
929	0				0	$key .= ' ' . join(', ', @query_args);
930						}
931	0	0			0	if(my $rc = $c->get($key)) {
932	0				0	$self->_debug('cache HIT');
933	0 0	0			0 0	return wantarray ? @{$rc} : $rc; # We stored a ref to the array
934
935						# This use of a temporary variable is to avoid
936						# "Implicit scalar context for array in return"
937						# my @rc = @{$rc};
938						# return @rc;
939						}
940	0				0	$self->_debug('cache MISS');
941						} else {
942	4				7	$self->_debug('cache not used');
943						}
944
945	4	50			67	if(my $sth = $self->{$table}->prepare($query)) {
946	4	50			4316	$sth->execute(@query_args) \|\|
947						# throw Error::Simple("$query: @query_args");
948						croak("$query: @query_args");
949
950	4				7042	my $rc;
951	4				18	while(my $href = $sth->fetchrow_hashref()) {
952	22	50			527	return $href if(!wantarray); # FIXME: Doesn't store in the cache
953	22 22				13 42	push @{$rc}, $href;
954						}
955	4	50			78	if($c) {
956	0				0	$c->set($key, $rc, $self->{'cache_duration'}); # Store a ref to the array
957						}
958
959	4	50			7	if($rc) {
960	4 4	50			8 8	Data::Reuse::fixate(@{$rc}) if(!$self->{'no_fixate'});
961	4 4				675 18	return @{$rc};
962						}
963	0				0	return;
964						}
965	0				0	$self->_warn("selectall_array failure on $query: @query_args");
966						# throw Error::Simple("$query: @query_args");
967	0				0	croak("$query: @query_args");
968						}
969
970 - 974						=head2 selectall_hash Deprecated misleading legacy name for selectall_array. =cut
975
976						sub selectall_hash
977						{
978	7			7	1618	my $self = shift;
979	7				19	return $self->selectall_array(@_);
980						}
981
982 - 986						=head2 count Return the number items/rows matching the given criteria =cut
987
988						sub count
989						{
990	5			5	745	my $self = shift;
991
992	5	50			8	if($self->{'berkeley'}) {
993	0				0	Carp::croak(ref($self), ': count is meaningless on a NoSQL database');
994						}
995
996	5				10	my $params = Params::Get::get_params(undef, \@_);
997	5				56	my $table = $self->_open_table($params);
998
999	5	100			7	if($self->{'data'}) {
1000	3 3	100 50	25		3 3	if(scalar(keys %{$params}) == 0) {
1001	2				4	$self->_trace("$table: count fast track return");
1002	2	50			24	if(ref($self->{'data'}) eq 'HASH') {
1003	2 2				2 6	return scalar keys %{$self->{'data'}};
1004						}
1005	0 0				0 0	return scalar @{$self->{'data'}};
1006	1				3	} elsif((scalar(keys %{$params}) == 1) && defined($params->{'entry'}) && !$self->{'no_entry'}) {
1007	0	0			0	return $self->{'data'}->{$params->{'entry'}} ? 1 : 0;
1008						}
1009						}
1010
1011	3				3	my $query;
1012	3				2	my $done_where = 0;
1013
1014	3	100 50	67		10	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1015	1				1	$query = "SELECT COUNT(*) FROM $table WHERE entry IS NOT NULL AND entry NOT LIKE '#%'";
1016	1				1	$done_where = 1;
1017						} elsif($self->{no_entry}) {
1018	2				2	$query = "SELECT COUNT(*) FROM $table";
1019						} else {
1020	0				0	$query = "SELECT COUNT(entry) FROM $table";
1021						}
1022
1023	3				3	my @query_args;
1024	3 3				3 5	foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1025	1				1	my $arg = $params->{$c1};
1026	1	50			1	if(ref($arg)) {
1027						# throw Error::Simple("$query: argument is not a string: " . ref($arg));
1028	0				0	$self->_fatal("count(): $query: argument is not a string: ", ref($arg));
1029						}
1030	1	50			2	if(!defined($arg)) {
1031	0				0	my @call_details = caller(0);
1032						# throw Error::Simple("$query: value for $c1 is not defined in call from " .
1033						# $call_details[2] . ' of ' . $call_details[1]);
1034	0				0	Carp::croak("$query: value for $c1 is not defined in call from ",
1035						$call_details[2], ' of ', $call_details[1]);
1036						}
1037
1038	1				0	my $keyword;
1039	1	50			1	if($done_where) {
1040	1				2	$keyword = 'AND';
1041						} else {
1042	0				0	$keyword = 'WHERE';
1043	0				0	$done_where = 1;
1044						}
1045	1	50			1	if($arg =~ /[%_]/) {
1046	0				0	$query .= " $keyword $c1 LIKE ?";
1047						} else {
1048	1				5	$query .= " $keyword $c1 = ?";
1049						}
1050	1				2	push @query_args, $arg;
1051						}
1052	3	100			5	if(!$self->{no_entry}) {
1053	1				2	$query .= ' ORDER BY ' . $self->{'id'};
1054						}
1055
1056	3	100			3	if(defined($query_args[0])) {
1057	1				2	$self->_debug("count $query: ", join(', ', @query_args));
1058						} else {
1059	2				4	$self->_debug("count $query");
1060						}
1061
1062	3				43	my $key;
1063						my $c;
1064	3	50			4	if($c = $self->{cache}) {
1065	0				0	$key = ref($self) . '::' . $query;
1066	0				0	$key =~ s/COUNT$(.+?)$/$1/;
1067	0				0	$key .= ' array';
1068	0	0			0	if(defined($query_args[0])) {
1069	0				0	$key .= ' ' . join(', ', @query_args);
1070						}
1071	0	0			0	if(my $rc = $c->get($key)) {
1072						# Unlikely
1073	0				0	$self->_debug('cache HIT');
1074	0 0				0 0	return scalar @{$rc}; # We stored a ref to the array
1075						}
1076	0				0	$self->_debug('cache MISS');
1077						} else {
1078	3				3	$self->_debug('cache not used');
1079						}
1080
1081	3	50			42	if(my $sth = $self->{$table}->prepare($query)) {
1082	3	50			1755	$sth->execute(@query_args) \|\|
1083						# throw Error::Simple("$query: @query_args");
1084						croak("$query: @query_args");
1085
1086	3				3779	my $count = $sth->fetchrow_arrayref()->[0];
1087
1088	3				64	$sth->finish();
1089
1090	3				16	return $count;
1091						}
1092	0				0	$self->_warn("count failure on $query: @query_args");
1093						# throw Error::Simple("$query: @query_args");
1094	0				0	croak("$query: @query_args");
1095						}
1096
1097 - 1110						=head2 fetchrow_hashref Returns a hash reference for a single row in a table. It searches for the given arguments, searching IS NULL if the value is C<undef> my $res = $foo->fetchrow_hashref(entry => 'one'); Special argument: table: determines the table to read from if not the default, which is worked out from the class name When no_entry is not set allow just one argument to be given: the entry value. =cut
1111
1112						sub fetchrow_hashref {
1113	8			8	1134	my $self = shift;
1114
1115	8				20	$self->_trace('Entering fetchrow_hashref');
1116
1117	8				106	my $params;
1118
1119	8	100			12	if(!$self->{'no_entry'}) {
1120	7				9	$params = Params::Get::get_params('entry', @_);
1121						} else {
1122	1				4	$params = Params::Get::get_params(undef, @_);
1123						}
1124
1125	8				90	my $table = $self->_open_table($params);
1126
1127						# ::diag($self->{'type'});
1128	8 7	100	60		18 16	if($self->{'data'} && (!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1129	4				5	$self->_debug('Fast return from slurped data');
1130	4				64	return $self->{'data'}->{$params->{'entry'}};
1131						}
1132
1133	4	50			7	if($self->{'berkeley'}) {
1134						# print STDERR ">>>>>>>>>>>>\n";
1135						# ::diag(Data::Dumper->new([$self->{'berkeley'}])->Dump());
1136	0 0	0	0		0 0	if((!$self->{'no_entry'}) && (scalar keys(%{$params}) == 1) && defined($params->{'entry'})) {
1137	0				0	return { entry => $self->{'berkeley'}->{$params->{'entry'}} };
1138						}
1139	0				0	my $id = $self->{'id'};
1140	0 0	0	0		0 0	if($self->{'no_entry'} && (scalar keys(%{$params}) == 1) && defined($id) && defined($params->{$id})) {
1141	0	0			0	if(my $rc = $self->{'berkeley'}->{$params->{$id}}) {
1142	0				0	return { $params->{$id} => $rc } # Return key->value as a hash pair
1143						}
1144	0				0	return;
1145						}
1146	0				0	Carp::croak(ref($self), ': fetchrow_hashref is meaningless on a NoSQL database');
1147						}
1148
1149	4				5	my $query = 'SELECT * FROM ';
1150	4	50			6	if(my $t = delete $params->{'table'}) {
1151	0				0	$query .= $t;
1152						} else {
1153	4				5	$query .= $table;
1154						}
1155	4				3	my $done_where = 0;
1156
1157	4	100	67		12	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1158	3				5	$query .= ' WHERE ' . $self->{'id'} . ' IS NOT NULL AND ' . $self->{'id'} . " NOT LIKE '#%'";
1159	3				2	$done_where = 1;
1160						}
1161	4				4	my @query_args;
1162	4 4				2 6	foreach my $c1(sort keys(%{$params})) { # sort so that the key is always the same
1163	4				3	my $keyword;
1164
1165	4	100			5	if($done_where) {
1166	3				1	$keyword = 'AND';
1167						} else {
1168	1				1	$keyword = 'WHERE';
1169	1				0	$done_where = 1;
1170						}
1171	4	100			7	if(my $arg = $params->{$c1}) {
1172	3	50			2	if(ref($arg)) {
1173						# throw Error::Simple("$query: argument is not a string: " . ref($arg));
1174	0				0	$self->_fatal("fetchrow_hash(): $query: argument is not a string: ", ref($arg));
1175						}
1176
1177	3	50			6	if($arg =~ /[%_]/) {
1178	0				0	$query .= " $keyword $c1 LIKE ?";
1179						} else {
1180	3				3	$query .= " $keyword $c1 = ?";
1181						}
1182	3				5	push @query_args, $arg;
1183						} else {
1184	1				2	$query .= " $keyword $c1 IS NULL";
1185						# my @call_details = caller(0);
1186						# # throw Error::Simple("$query: value for $c1 is not defined in call from " .
1187						# # $call_details[2] . ' of ' . $call_details[1]);
1188						# Carp::croak("$query: value for $c1 is not defined in call from ",
1189						# $call_details[2], ' of ', $call_details[1]);
1190						}
1191						}
1192						# $query .= ' ORDER BY entry LIMIT 1';
1193	4				5	$query .= ' LIMIT 1';
1194	4	100			6	if(defined($query_args[0])) {
1195	3				5	my @call_details = caller(0);
1196	3				37	$self->_debug("fetchrow_hashref $query: ", join(', ', @query_args),
1197						' called from ', $call_details[2], ' of ', $call_details[1]);
1198						} else {
1199	1				2	$self->_debug("fetchrow_hashref $query");
1200						}
1201	4				56	my $key = ref($self) . '::';
1202	4	100			6	if(defined($query_args[0])) {
1203	3	50			5	if(wantarray) {
1204	0				0	$key .= 'array ';
1205						}
1206	3				6	$key .= "fetchrow $query " . join(', ', @query_args);
1207						} else {
1208	1				1	$key .= "fetchrow $query";
1209						}
1210	4				4	my $c;
1211	4	50			5	if($c = $self->{cache}) {
1212	0	0			0	if(my $rc = $c->get($key)) {
1213	0	0			0	if(wantarray) {
1214	0	0			0	if(ref($rc) eq 'ARRAY') {
1215	0 0				0 0	return @{$rc}; # We stored a ref to the array
1216						}
1217						} else {
1218	0				0	return $rc;
1219						}
1220						}
1221						}
1222
1223	4	50			23	my $sth = $self->{$table}->prepare($query) or die $self->{$table}->errstr();
1224						# $sth->execute(@query_args) \|\| throw Error::Simple("$query: @query_args");
1225	4	50			10655	$sth->execute(@query_args) \|\| croak("$query: @query_args");
1226	4				4648	my $rc = $sth->fetchrow_hashref();
1227	4	50			162	if($c) {
1228	0	0			0	if($rc) {
1229	0				0	$self->_debug("stash $key=>$rc in the cache for ", $self->{'cache_duration'});
1230	0				0	$self->_debug("returns ", Data::Dumper->new([$rc])->Dump());
1231						} else {
1232	0				0	$self->_debug("Stash $key=>undef in the cache for ", $self->{'cache_duration'});
1233						}
1234	0				0	$c->set($key, $rc, $self->{'cache_duration'});
1235						}
1236	4				24	return $rc;
1237						}
1238
1239 - 1252						=head2 execute Execute the given SQL query on the database. In an array context, returns an array of hash refs, in a scalar context returns a hash of the first row On CSV tables without no_entry, it may help to add "WHERE entry IS NOT NULL AND entry NOT LIKE '#%'" to the query. If the data have been slurped, this will still work by accessing that actual database. =cut
1253
1254						sub execute
1255						{
1256	2			2	925	my $self = shift;
1257
1258	2	50			4	if($self->{'berkeley'}) {
1259	0				0	Carp::croak(ref($self), ': execute is meaningless on a NoSQL database');
1260						}
1261
1262	2				7	my $args = Params::Get::get_params('query', @_);
1263						# Ensure the 'query' parameter is provided
1264						Carp::croak(__PACKAGE__, ': Usage: execute(query => $query)')
1265	2	50			21	unless defined $args->{'query'};
1266
1267	2				3	my $table = $self->_open_table($args);
1268
1269	2				3	my $query = $args->{'query'};
1270
1271						# Append "FROM <table>" if missing
1272	2	50			11	$query .= " FROM $table" unless $query =~ /\sFROM\s/i;
1273
1274						# Log the query if a logger is available
1275	2				5	$self->_debug("execute $query");
1276
1277						# Prepare and execute the query
1278	2				32	my $sth = $self->{$table}->prepare($query);
1279	2	50			1287	$sth->execute() or croak($query); # Die with the query in case of error
1280
1281						# Fetch the results
1282	2				1732	my @results;
1283	2				22	while (my $row = $sth->fetchrow_hashref()) {
1284						# Return a single hashref if scalar context is expected
1285	6	50			150	return $row unless wantarray;
1286	6				13	push @results, $row;
1287						}
1288
1289						# Return all rows as an array in list context
1290	2				41	return @results;
1291						}
1292
1293 - 1297						=head2 updated Returns the timestamp of the last database update. =cut
1298
1299						sub updated {
1300	1			1	384	my $self = shift;
1301
1302	1				2	return $self->{'_updated'};
1303						}
1304
1305 - 1326						=head2 AUTOLOAD Directly access a database column. Returns all entries in a column, a single entry based on criteria. Uses cached data if available. Returns an array of the matches, or only the first when called in scalar context If the database has a column called "entry" you can do a quick lookup with my $value = $foo->column('123'); # where "column" is the value you're after my @entries = $foo->entry(); print 'There are ', scalar(@entries), " entries in the database\n"; Set distinct or unique to 1 if you're after a unique list. Throws an error in slurp mode when an invalid column name is given. =cut
1327
1328						sub AUTOLOAD {
1329	30			30	6167	our $AUTOLOAD;
1330	30				96	my ($column) = $AUTOLOAD =~ /::(\w+)$/;
1331
1332	30	50			43	return if($column eq 'DESTROY');
1333
1334	30	50			44	my $self = shift or return;
1335
1336	30	50			30	Carp::croak(__PACKAGE__, ": Unknown column $column") if(!ref($self));
1337
1338						# Allow the AUTOLOAD feature to be disabled
1339	30	50	33		42	Carp::croak(__PACKAGE__, ": Unknown column $column") if(exists($self->{'auto_load'}) && boolean($self->{'auto_load'})->isFalse());
1340
1341						# Validate column name - only allow safe column name
1342	30	50			54	Carp::croak(__PACKAGE__, ": Invalid column name: $column") unless $column =~ /^[a-zA-Z_][a-zA-Z0-9_]*$/;
1343
1344	30				44	my $table = $self->_open_table();
1345
1346	29				24	my %params;
1347	29	100 100 50			62	if(ref($_[0]) eq 'HASH') {
1348	2 2				2 4	%params = %{$_[0]};
1349						} elsif((scalar(@_) % 2) == 0) {
1350	11				14	%params = @_;
1351						} elsif(scalar(@_) == 1) {
1352						# Don't error on key-value databases, since there's no idea of columns
1353	16	50	33		21	if($self->{'no_entry'} && !$self->{'berkeley'}) {
1354	0				0	Carp::croak(ref($self), "::($_[0]): ", $self->{'id'}, ' is not a column');
1355						}
1356	16				20	$params{'entry'} = shift;
1357						}
1358
1359	29	50			34	if($self->{'berkeley'}) {
1360	0	0			0	if(my $id = $self->{'id'}) {
1361	0				0	return $self->{'berkeley'}->{$params{$id}};
1362						}
1363	0				0	return $self->{'berkeley'}->{$params{'entry'}};
1364						}
1365
1366	29	50			31	croak('Where did the data come from?') if(!defined($self->{'type'}));
1367	29				16	my $query;
1368	29				21	my $done_where = 0;
1369	29		100		53	my $distinct = delete($params{'distinct'}) \|\| delete($params{'unique'});
1370
1371	29	100	100		45	if(wantarray && !$distinct) {
1372	5	100	100		12	if(((scalar keys %params) == 0) && (my $data = $self->{'data'})) {
1373						# Return all the entries in the column
1374	3 12 3				3 15 6	return map { $_->{$column} } values %{$data};
1375						}
1376	2	100	67		6	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1377	1				1	$query = "SELECT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1378	1				1	$done_where = 1;
1379						} else {
1380	1				1	$query = "SELECT $column FROM $table";
1381						}
1382						} else {
1383	24	100			28	if(my $data = $self->{'data'}) {
1384						# The data has been read in using Text::xSV::Slurp,
1385						# so no need to do any SQL
1386	18				22	$self->_debug('AUTOLOAD using slurped data');
1387	18	50 100 100	100		348	if($self->{'no_entry'}) {
1388	0				0	$self->_debug('no_entry is set');
1389	0				0	my ($key, $value) = %params;
1390	0	0			0	if(defined($key)) {
1391	0				0	$self->_debug("key = $key, value = $value, column = $column");
1392	0 0				0 0	foreach my $row(@{$data}) {
1393	0	0	0		0	if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1394	0	0			0	if(defined($rc)) {
1395	0				0	$self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1396						} else {
1397	0				0	$self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1398						}
1399	0				0	return $rc
1400						}
1401						}
1402	0				0	$self->_debug('not found in slurped data');
1403						}
1404						} elsif(((scalar keys %params) == 1) && defined(my $key = $params{'entry'})) {
1405						# Look up the key
1406
1407						# This weird code is to stop the data hash becoming polluted with empty
1408						# values as we look things up
1409						# my $rc = $data->{$key}->{$column};
1410	14				10	my $rc;
1411	14	100			16	if(defined(my $hash = $data->{$key})) {
1412						# Look up the key
1413	10	100			10	if(!exists($hash->{$column})) {
1414	1				7	Carp::croak(__PACKAGE__, ": There is no column $column in $table");
1415						}
1416	9				9	$rc = $hash->{$column};
1417						}
1418	13	100			12	if(defined($rc)) {
1419	8				39	$self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1420						} else {
1421	5				11	$self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1422						}
1423	13				211	return $rc
1424						} elsif((scalar keys %params) == 0) {
1425	2	50			2	if(wantarray) {
1426	2	50			4	if($distinct) {
1427						# https://stackoverflow.com/questions/7651/how-do-i-remove-duplicate-items-from-an-array-in-perl
1428	2 7 7 2				2 6 7 2	my %h = map { $_, 1 } map { $_->{$column} } values %{$data};
1429	2				6	return keys %h;
1430						}
1431	0 0 0				0 0 0	return map { $_->{$column} } values %{$data}
1432						}
1433						# FIXME - this works but really isn't the right way to do it
1434	0 0				0 0	foreach my $v (values %{$data}) {
1435	0				0	return $v->{$column}
1436						}
1437						} else {
1438						# It's keyed, but we're not querying off it
1439	2				3	my ($key, $value) = %params;
1440	2 2				3 2	foreach my $row (values %{$data}) {
1441	7	100	75		17	if(defined($row->{$key}) && ($row->{$key} eq $value) && (my $rc = $row->{$column})) {
1442	2	50			2	if(defined($rc)) {
1443	2				4	$self->_trace(__LINE__, ": AUTOLOAD $key: return '$rc' from slurped data");
1444						} else {
1445	0				0	$self->_trace(__LINE__, ": AUTOLOAD $key: return undef from slurped data");
1446						}
1447	2				43	return $rc
1448						}
1449						}
1450						}
1451						return
1452	0				0	}
1453						# Data has not been slurped in
1454	6	50	67		14	if(($self->{'type'} eq 'CSV') && !$self->{no_entry}) {
1455	0				0	$query = "SELECT DISTINCT $column FROM $table WHERE " . $self->{'id'} . " IS NOT NULL AND entry NOT LIKE '#%'";
1456	0				0	$done_where = 1;
1457						} else {
1458	6				7	$query = "SELECT DISTINCT $column FROM $table";
1459						}
1460						}
1461	8				6	my @args;
1462	8				14	while(my ($key, $value) = each %params) {
1463	6				14	$self->_debug(__PACKAGE__, ": AUTOLOAD adding key/value pair $key=>$value");
1464	6	50			361	if(defined($value)) {
1465	6	100			7	if($done_where) {
1466	1				1	$query .= " AND $key = ?";
1467						} else {
1468	5				5	$query .= " WHERE $key = ?";
1469	5				5	$done_where = 1;
1470						}
1471	6				13	push @args, $value;
1472						} else {
1473	0				0	$self->_debug("AUTOLOAD params $key isn't defined");
1474	0	0			0	if($done_where) {
1475	0				0	$query .= " AND $key IS NULL";
1476						} else {
1477	0				0	$query .= " WHERE $key IS NULL";
1478	0				0	$done_where = 1;
1479						}
1480						}
1481						}
1482	8	100			9	if(wantarray) {
1483	3				4	$query .= " ORDER BY $column";
1484						} else {
1485	5				4	$query .= ' LIMIT 1';
1486						}
1487	8	100	67		14	if(scalar(@args) && $args[0]) {
1488	6				17	$self->_debug("AUTOLOAD $query: ", join(', ', @args));
1489						} else {
1490	2				3	$self->_debug("AUTOLOAD $query");
1491						}
1492	8				363	my $cache;
1493	8				9	my $key = ref($self) . '::';
1494	8	50			15	if($cache = $self->{cache}) {
1495	0	0			0	if(wantarray) {
1496	0				0	$key .= 'array ';
1497						}
1498	0	0			0	if(defined($args[0])) {
1499	0				0	$key .= "fetchrow $query " . join(', ', @args);
1500						} else {
1501	0				0	$key .= "fetchrow $query";
1502						}
1503	0	0			0	if(my $rc = $cache->get($key)) {
1504	0				0	$self->_debug('cache HIT');
1505	0 0	0			0 0	return wantarray ? @{$rc} : $rc; # We stored a ref to the array
1506						}
1507	0				0	$self->_debug('cache MISS');
1508						} else {
1509	8				10	$self->_debug('cache not used');
1510						}
1511						# my $sth = $self->{$table}->prepare($query) \|\| throw Error::Simple($query);
1512	8		33		380	my $sth = $self->{$table}->prepare($query) \|\| croak($query);
1513						# $sth->execute(@args) \|\| throw Error::Simple($query);
1514	8	50			13743	$sth->execute(@args) \|\| croak($query);
1515
1516	8	100			17925	if(wantarray) {
1517	3 10 3				3 191 11	my @rc = map { $_->[0] } @{$sth->fetchall_arrayref()};
1518	3	50			13	if($cache) {
1519	0				0	$cache->set($key, \@rc, $self->{'cache_duration'}); # Store a ref to the array
1520						}
1521	3	50			9	Data::Reuse::fixate(@rc) if(!$self->{'no_fixate'});
1522	3				176	return @rc;
1523						}
1524	5				27	my $rc = $sth->fetchrow_array(); # Return the first match only
1525	5				83	$sth->finish();
1526	5	50			20	if($cache) {
1527	0				0	return $cache->set($key, $rc, $self->{'cache_duration'});
1528						}
1529	5				16	return $rc;
1530						}
1531
1532						sub DESTROY
1533						{
1534	31	50	33	31	9355	if(defined($^V) && ($^V ge 'v5.14.0')) {
1535	31	50			59	return if ${^GLOBAL_PHASE} eq 'DESTRUCT'; # >= 5.14.0 only
1536						}
1537	31				48	my $self = shift;
1538
1539						# Clean up temporary file
1540	31	50			45	if($self->{'temp'}) {
1541	0				0	unlink delete $self->{'temp'};
1542						}
1543
1544						# Clean up database handles
1545	31		33		58	my $table_name = $self->{'table'} \|\| ref($self);
1546	31				79	$table_name =~ s/.*:://;
1547
1548	31	100			45	if(my $dbh = delete $self->{$table_name}) {
1549	14	50			108	$dbh->disconnect() if $dbh->can('disconnect');
1550	14	50			295	$dbh->finish() if $dbh->can('finish');
1551						}
1552
1553						# Clean up Berkeley DB
1554	31	50			42	if($self->{'berkeley'}) {
1555	0				0	eval {
1556	0 0				0 0	untie %{$self->{'berkeley'}};
1557						};
1558	0				0	delete $self->{'berkeley'};
1559						}
1560
1561						# Clear all other attributes to break potential circular references
1562	31				46	foreach my $key (keys %$self) {
1563	291				673	delete $self->{$key};
1564						}
1565						}
1566
1567						# Determine the table and open the database
1568						sub _open_table
1569						{
1570	60			60	56	my($self, $params) = @_;
1571
1572						# Get table name (remove package name prefix if present)
1573	60		33		159	my $table = $params->{'table'} \|\| $self->{'table'} \|\| ref($self);
1574	60				121	$table =~ s/.*:://;
1575
1576						# Open a connection if it's not already open
1577	60	50	67		121	$self->_open() if((!$self->{$table}) && (!$self->{'data'}));
1578
1579	59				69	return $table;
1580						}
1581
1582						# Determine whether a given file is a valid Berkeley DB file.
1583						# It combines a fast preliminary check with a more thorough validation step for accuracy.
1584						# It looks for the magic number at both byte 0 and byte 12
1585						# TODO: Combine _db_0 and _db_12 as they are very similar routines
1586						sub _is_berkeley_db {
1587	17			17	30	my ($self, $file) = @_;
1588
1589						# Step 1: Check magic number
1590	17	50			189	open my $fh, '<', $file or return 0;
1591	0				0	binmode $fh;
1592
1593	0		0		0	my $is_db = (($self->_is_berkeley_db_0($fh)) \|\| ($self->_is_berkeley_db_12($fh)));
1594	0				0	close $fh;
1595
1596	0	0			0	if($is_db) {
1597						# Step 2: Attempt to open as Berkeley DB
1598
1599	0	0			0	require DB_File && DB_File->import();
1600
1601	0				0	my %bdb;
1602	0	0			0	if(tie %bdb, 'DB_File', $file, O_RDONLY, 0644, $DB_File::DB_HASH) {
1603						# untie %db;
1604	0				0	$self->{'berkeley'} = \%bdb;
1605	0				0	return 1; # Successfully identified as a Berkeley DB file
1606						}
1607						}
1608	0				0	return 0;
1609						}
1610
1611						# Determine whether a given file is a valid Berkeley DB file.
1612						# It combines a fast preliminary check with a more thorough validation step for accuracy.
1613						sub _is_berkeley_db_0
1614						{
1615	0			0	0	my ($self, $fh) = @_;
1616
1617						# Read the first 4 bytes (magic number)
1618	0	0			0	read($fh, my $magic_bytes, 4) == 4 or return 0;
1619
1620						# Unpack both big-endian and little-endian values
1621	0				0	my $magic_be = unpack('N', $magic_bytes); # Big-endian
1622	0				0	my $magic_le = unpack('V', $magic_bytes); # Little-endian
1623
1624						# Known Berkeley DB magic numbers (in both endian formats)
1625	0 0				0 0	my %known_magic = map { $_ => 1 } (
1626						0x00061561, # Btree
1627						0x00053162, # Hash
1628						0x00042253, # Queue
1629						0x00052444, # Recno
1630						);
1631
1632	0		0		0	return($known_magic{$magic_be} \|\| $known_magic{$magic_le});
1633						}
1634
1635						sub _is_berkeley_db_12
1636						{
1637	0			0	0	my ($self, $fh) = @_;
1638	0				0	my $header;
1639
1640	0	0			0	seek $fh, 12, 0 or return 0;
1641	0	0			0	read($fh, $header, 4) or return 0;
1642
1643	0				0	$header = substr(unpack('H*', $header), 0, 4);
1644
1645						# Berkeley DB magic numbers
1646	0		0		0	return($header eq '6115' \|\| $header eq '1561'); # Btree
1647						}
1648
1649						# Log and remember a message
1650						sub _log
1651						{
1652	185			185	237	my ($self, $level, @messages) = @_;
1653
1654						# FIXME: add caller's function
1655						# if(($level eq 'warn') \|\| ($level eq 'notice')) {
1656	185 185				126 500	push @{$self->{'messages'}}, { level => $level, message => join('', grep defined, @messages) };
1657						# }
1658
1659	185	50	33		326	if(scalar(@messages) && (my $logger = $self->{'logger'})) {
1660	185				364	$self->{'logger'}->$level(join('', grep defined, @messages));
1661						}
1662						}
1663
1664						sub _debug {
1665	137			137	225	my $self = shift;
1666	137				154	$self->_log('debug', @_);
1667						}
1668
1669						sub _trace {
1670	47			47	47	my $self = shift;
1671	47				71	$self->_log('trace', @_);
1672						}
1673
1674						# Emit a warning message somewhere
1675						sub _warn {
1676	0			0	0	my $self = shift;
1677	0				0	my $params = Params::Get::get_params('warning', @_);
1678
1679	0				0	$self->_log('warn', $params->{'warning'});
1680	0				0	Carp::carp(join('', grep defined, $params->{'warning'}));
1681						}
1682
1683						# Die
1684						sub _fatal {
1685	1			1	0	my $self = shift;
1686	1				2	my $params = Params::Get::get_params('warning', @_);
1687
1688	1				8	$self->_log('error', $params->{'warning'});
1689	1				1223	Carp::croak(join('', grep defined, $params->{'warning'}));
1690						}
1691
1692 - 1744						=head1 AUTHOR Nigel Horne, C<< <njh at nigelhorne.com> >> =head1 SUPPORT This module is provided as-is without any warranty. Please report any bugs or feature requests to C<bug-database-abstraction at rt.cpan.org>, or through the web interface at L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Database-Abstraction>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes. =head1 BUGS The default delimiter for CSV files is set to '!', not ',' for historical reasons. I really ought to fix that. It would be nice for the key column to be called key, not entry, however key's a reserved word in SQL. The no_entry parameter should be no_id. XML slurping is hard, so if XML fails for you on a small file force non-slurping mode with $foo = MyPackageName::Database::Foo->new({ directory => '/var/dat', max_slurp_size => 0 # force to not use slurp and therefore to use SQL }); =head1 SEE ALSO =over 4 =item * Test coverage report: L<https://nigelhorne.github.io/Database-Abstraction/coverage/> =back =head1 LICENSE AND COPYRIGHT Copyright 2015-2025 Nigel Horne. This program is released under the following licence: GPL2. Usage is subject to licence terms. The licence terms of this software are as follows: Personal single user, single computer use: GPL2 All other users (for example, Commercial, Charity, Educational, Government) must apply in writing for a licence for use from Nigel Horne at the above e-mail. =cut
1745
1746						1;